I am working on a data set for analysis.
I have to remove the column name of a single variable in a data frame in R.
I used colnames(df) <- NULL, but it removes all the column names of the data frame.
The expected output is:
I don't really get the expected output, but maybe this is what you're looking for
colnames(df)[i] <- NA
Here i is the column position (for example 1, 2, 3, etc.).
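A minimal sketch of the line above on a made-up two-column data frame (the names a and b are only for illustration). It also shows an empty string as an alternative to NA; that alternative is an assumption on my part, not something the answer above proposes.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(a = 1:3, b = 4:6)

i <- 2                      # position of the column whose name should go
colnames(df)[i] <- NA       # only column i loses its name
colnames(df)

# Alternative (an assumption, not from the answer): use an empty string
df2 <- data.frame(a = 1:3, b = 4:6)
colnames(df2)[i] <- ""
colnames(df2)
```

Either way the other column names are left untouched, unlike colnames(df) <- NULL.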
I am working with the data frame below and want to compare the values in two of its columns:
#create a sample data frame
df<-data.frame(
item=c("a","b","c","d"),
price_today=c(1,2,3,4),
price_yesterday=c(1,2,3,4)
)
If the value in column price_today is the same as the value in column price_yesterday, then print okay, or else print not okay. The printed result should be stored as a new variable in the data frame.
How should I go about the ifelse part here?
Many thanks for your help and have a good day.
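A minimal sketch of one way to do this, assuming the new column can be called result (a name I made up) and that both price columns have the same length. The values are illustrative, with one deliberate mismatch so both outcomes show up.

```r
# Sample data with equal-length columns (values are illustrative)
df <- data.frame(
  item = c("a", "b", "c", "d"),
  price_today = c(1, 2, 3, 4),
  price_yesterday = c(1, 2, 5, 4)
)

# ifelse() is vectorised: it compares the two columns row by row
df$result <- ifelse(df$price_today == df$price_yesterday, "okay", "not okay")
df
```

Because ifelse() works on whole vectors, no loop over the rows is needed.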
Modified question:
Hi all, so now what if the df becomes like this:
#create a sample data frame (modified)
df<-data.frame(
item=c("a","a","c","d"),
price_today=c(1,"",3,"XYZ"),
price_yesterday=c(1,2,3,4)
)
Now column price_today contains both blank and non-numeric values, and column item contains a,a,c,d instead of a,b,c,d. I have been trying to do the following:
Filter column item by "a", with the code below:
df_1<-df[df$item=="a",]
After df_1 is filtered, filter again on price_today by removing the blank and non-numeric values, with the code below:
df_1<-df[!is.numeric(df_1$price_today),]
I am able to filter by "a" in column item. However, the second filter returns the original df. What did I do wrong here?
Million thanks for your help and have a good day/night.
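One likely cause, sketched under the assumption that the goal is to keep only the rows of item "a" whose price_today is a valid number. is.numeric() tests the type of the whole vector and returns a single TRUE or FALSE. Because the column contains "" and "XYZ", it is stored as character, so !is.numeric(...) is a single TRUE that is recycled and selects every row. The second line also indexes df rather than df_1. Converting with as.numeric() and checking for NA tests each element instead; price_num is a helper name I introduced.

```r
df <- data.frame(
  item = c("a", "a", "c", "d"),
  price_today = c(1, "", 3, "XYZ"),
  price_yesterday = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# First filter: rows where item is "a"
df_1 <- df[df$item == "a", ]

# as.numeric() turns "" and "XYZ" into NA (any warning is silenced here)
price_num <- suppressWarnings(as.numeric(df_1$price_today))

# Second filter: keep only the rows where the conversion succeeded
df_1 <- df_1[!is.na(price_num), ]
df_1$price_today <- as.numeric(df_1$price_today)
df_1
```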
My column titles are not correct. I want to rename all the columns of a matrix (they are currently v1, v2, v3, ...) according to a data frame, so that the name of the first column of the matrix matches the first column title of the data frame, and so on. I have to repeat this for my 39 columns, so the goal would be a for loop.
df1 is the matrix that has to be changed.
for (i in 1:39) {
names(df1[,i]) <- names(dfnorm[,i])
}
This code is not working.
If I understand correctly, this should work:
names(df1)[1:39] <- names(dfnorm)[1:39]
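Note that the names() line above renames columns only if df1 is a data frame. If df1 really is a matrix, as the question says, colnames() is the function to use. The loop in the question fails because names(dfnorm[, i]) asks for the names of the values inside the column, which is NULL, not the column title. A small sketch with made-up 3-column objects standing in for the 39-column ones:

```r
# Hypothetical stand-ins for the real objects
dfnorm <- data.frame(height = 1:2, weight = 3:4, age = 5:6)
df1 <- matrix(1:6, nrow = 2)          # a matrix without column names

# For a matrix, set the column names with colnames(); no loop is needed
colnames(df1) <- names(dfnorm)[seq_len(ncol(df1))]
colnames(df1)
```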
I would like to assign names to rows in R but so far I have only found ways to assign names to columns. My data is in two columns where the first column (geo) is assigned with the name of the specific location I'm investigating and the second column (skada) is the observed value at that specific location. To clarify, I want to be able to assign names for every location instead of just having them all in one .txt file so that the data is easier to work with. Anyone with more experience than me that knows how to handle this in R?
First you need to import the data to your global environment. Try the function read.table()
To name rows, try
(assuming your data.frame is named df):
rownames(df) <- df[, "geo"]
df <- df[, -1, drop = FALSE]  # drop = FALSE keeps df a data frame when only one column remains
Well, your question is not that clear...
I assume you are trying to create a data.frame with named rows. If you look at the data.frame help you can see the description of the row.names parameter:
NULL or a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names for the data frame.
which means you can either specify the row names manually when you create the data.frame, or point to the column that contains them. The former can be achieved as follows
d = data.frame(x=rnorm(10), # 10 random data normally distributed
y=rnorm(10), # 10 random data normally distributed
row.names=letters[1:10] # take the first 10 letters and use them as row header
)
while the latter is
d = data.frame(x=rnorm(10), # 10 random data normally distributed
y=rnorm(10), # 10 random data normally distributed
r=letters[1:10], # take the first 10 letters
row.names=3 # the column with the row headers is the 3rd
)
If you are reading the data from a file I will assume you are using the command read.table. Many of its parameters are the same as those of data.frame; in particular you will find that the row.names parameter works the same way:
a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.
Finally, if you have already read the data.frame and you want to change the row names, Pierre's answer is your solution
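As a rough sketch of the read.table route, here is an inline string standing in for the .txt file. The place names and values are made up; only the column names geo and skada come from the question.

```r
# Inline text standing in for the .txt file (values are made up)
txt <- "geo skada
Uppsala 12
Lund 7
Umea 3"

# row.names = "geo" uses the geo column as row names and drops it as a column
d <- read.table(text = txt, header = TRUE, row.names = "geo")

# Rows can now be addressed by location name
d["Lund", "skada"]
```

With a real file, the text = txt argument would be replaced by the file path.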
Please pardon my ignorance, since I am a beginner in R.
In R I am transposing a data frame (certain rows to columns) and saving the result back to a data frame, which is exactly what I need.
But the column name for the first column is missing, and I need it to join the result with other data frames.
Data frame result and function used:
dish_pair<-as.data.frame.matrix(xtabs(count~primary_id+subcategory_name, dishes))
output
But how can I get the first column to be named primary_id,
the column holding the row values 50792, 50793?
(I just need the first column to be named primary_id; the rest of the data frame values are correct.)
The first column on the picture is just the row names. You need to add it to the data frame and give the column a name:
dish_pair <- data.frame(primary_id=rownames(dish_pair), dish_pair)
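A self-contained sketch of the same idea, using a made-up dishes table (only the column names come from the question). The two extra arguments are my additions: row.names = NULL resets the row names once they have been copied into a column, and check.names = FALSE keeps subcategory names that contain spaces or other special characters unchanged.

```r
# Hypothetical input data; only the column names come from the question
dishes <- data.frame(
  primary_id = c(50792, 50792, 50793),
  subcategory_name = c("Soup", "Salad", "Soup"),
  count = c(2, 1, 3)
)

# Cross-tabulate: one row per primary_id, one column per subcategory
dish_pair <- as.data.frame.matrix(
  xtabs(count ~ primary_id + subcategory_name, dishes)
)

# Move the row names into a real column called primary_id
dish_pair <- data.frame(
  primary_id = rownames(dish_pair),
  dish_pair,
  row.names = NULL,
  check.names = FALSE
)
dish_pair
```

Note that primary_id is a character column after this step, so it may need converting before a join with a numeric key.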
I have a data frame (originally from a CSV file) with the columns NAME and YEAR. I have extracted a sample from this data frame of the first ten entries like so:
sample<-df[1:10,]
I want to know the frequency of the values in the NAME column so I input the following:
as.data.frame(table(sample$NAME))
This counts the frequency in the sample correctly but also includes every name from the original data frame in the 'Var1' column (all with a Freq of 0).
The same thing happens if I use unique(sample$NAME) as well: it lists the names from the sample along with all of the names from the original data frame as well.
What am I doing wrong?
This could be a case of unused levels in the 'NAME' factor column. We can use droplevels or call factor again to remove those unused levels.
as.data.frame(table(droplevels(sample$NAME)))
Or
as.data.frame(table(factor(sample$NAME)))
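A small reproduction of the effect, assuming NAME is stored as a factor (the default for read.csv before R 4.0.0). The data are made up, and the sample is called smp here only to avoid masking the base function sample().

```r
# Hypothetical data; NAME is a factor with four levels
df <- data.frame(
  NAME = factor(c("Ann", "Bob", "Ann", "Cid", "Dee")),
  YEAR = c(2001, 2002, 2003, 2004, 2005)
)
smp <- df[1:3, ]              # first three rows only

# Subsetting keeps all original levels, so table() reports zero counts
with_unused <- as.data.frame(table(smp$NAME))

# droplevels() removes the levels that no longer occur in the sample
res <- as.data.frame(table(droplevels(smp$NAME)))
res
```

Reading the file with stringsAsFactors = FALSE avoids the issue altogether, since character columns have no levels.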