Removing first character from multiple column names - r

My question is about renaming multiple column names at once.
I have a dataframe called 'growth' with 46 columns.
Columns 2:46 are all named as dates, but all of the dates have an X in front of them, e.g. 'X1981'.
Naturally I want to remove the X from all of the column names.
I cannot understand why the following is not working:
colnames(growth[ ,2:length(growth)]) <- substring(colnames(growth[ ,2:length(growth)]),2)
Please help me with some insights.

Never mind, I changed the statement to...
names(growth)[2:46] <- substring(names(growth)[2:46],2)
...and now it works. Clearly it had something to do with how I was subsetting the columns.
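A minimal sketch of the working approach on a toy data frame (the values below are made up for illustration). The likely reason the first attempt had no effect is that it renames a temporary copy of the subset and then writes that copy's values back into growth, and column assignment does not appear to carry the new names along. Indexing the names vector itself avoids this:

```r
# Toy stand-in for 'growth'; the real one has 46 columns
growth <- data.frame(country = c("A", "B"),
                     X1981 = c(1.2, 3.4),
                     X1982 = c(5.6, 7.8))

# Index the vector of names, not the data frame, then assign into it
idx <- 2:ncol(growth)
names(growth)[idx] <- substring(names(growth)[idx], 2)

# Alternative: strip a leading "X" only where one is present
growth2 <- data.frame(country = c("A", "B"),
                      X1981 = c(1.2, 3.4),
                      X1982 = c(5.6, 7.8))
names(growth2) <- sub("^X", "", names(growth2))
```

The sub() variant does not depend on knowing which columns carry the prefix, which may be safer if the column order changes.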

Related

Cannot merge data with another dataset

I have a dataset called list that looks like this:
df
200000
5666666
It continues for 5551 rows.
Another dataset also has 5551 observations. I want to merge list with that dataset, but the two have no variable in common; only the row names are the same.
I tried
merge(list,df,by="rownames")
The error message says that it needs a valid column name.
I also tried merge_all, but that did not work either.
Could someone please help?
It's good practice to be more precise with the naming of your dataframe variables. I wouldn't use list but something like df_description. Either way, merging by rownames can be achieved by using by = "row.names" or by = 0. You can read more on merge() in the documentation (under "Details").
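As a small illustration of merging on row names, here is a sketch with two made-up data frames (df_description and df_other are hypothetical names, and the values are invented):

```r
# Two toy data frames that share only their row names
df_description <- data.frame(value = c(200000, 5666666),
                             row.names = c("r1", "r2"))
df_other <- data.frame(score = c(0.5, 0.9),
                       row.names = c("r2", "r1"))

# by = "row.names" (or by = 0) matches rows on their row names
merged <- merge(df_description, df_other, by = "row.names")

# merge() returns the old row names in a column called "Row.names";
# move them back if row names are wanted on the result
rownames(merged) <- as.character(merged$Row.names)
merged$Row.names <- NULL
```

Note that the rows are matched by name, not by position, so the differing row order in the two inputs does not matter.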

Assigning Unnamed Columns To Another DataFrame

I'm in a very basic class that introduces R for genetic purposes. I'm encountering a rather peculiar problem in trying to follow the instructions given. Here is what I have along with the instructor's notes:
MangrovesRaw<-read.csv("C:/Users/esteb/Documents/PopGen/MangrovesSites.csv")
#i'm going to make a new dataframe now, with one column more than the mangrovesraw dataframe but the same number of rows.
View(MangrovesRaw)
Mangroves<-data.frame(matrix(nrow = 528, ncol = 23))
#next I want you to name the first column of Mangroves "pop"
colnames(Mangroves)<-c(col1="pop")
#i'm now assigning all values of that column to be 1
Mangroves$pop<-1
#assign the rest of the columns (2 to 23) to the entirety of the MangrovesRaw dataframe
#then change the names to match the mangroves raw names
colnames(Mangroves)[2:23]<-colnames(MangrovesRaw)
I'm not really sure how to assign columns that haven't been named using the $ as we have in the past. A friend suggested I first run
colnames(Mangroves)[2:23]<-colnames(MangrovesRaw)
Mangroves$X338<-MangrovesRaw
#X338 is the name of the first column from MangrovesRaw
But while this does transfer the data from MangrovesRaw, it comes at the cost of having my column names messed up with X338. added to every subsequent column. In an attempt to modify this I found the following "fix"
colnames(Mangroves)[2:23]<-colnames(MangrovesRaw)
Mangroves$X338<-MangrovesRaw[,2]
#Mangroves$X338<-MangrovesRaw[,2:22]
#MangrovesRaw has 22 columns in total
While this transferred all the data I needed for the X338 column, it didn't transfer any data for the remaining 21 columns. The commented-out line just results in the same problem of having X338. show up in all my column names.
What am I doing wrong?
There are a few ways to solve this problem. It may be that your instructor wants it done a certain way, but here's one simple solution: just cbind() the pop column with the real data. The data and column names then come along automatically; naming the argument keeps the new column called pop.
Mangroves <- cbind(pop = Mangroves$pop, MangrovesRaw)
Here's another way:
Mangroves[, 2:23] <- MangrovesRaw
colnames(Mangroves)[2:23] <- colnames(MangrovesRaw)
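Both approaches can be tried on a small made-up stand-in for MangrovesRaw (two columns and three rows here instead of 22 and 528; Mangroves2 is a hypothetical name used only to keep the two results apart):

```r
# Toy stand-in for the data read from the CSV file
MangrovesRaw <- data.frame(X338 = c(1, 2, 3), X340 = c(4, 5, 6))

# Option 1: build the result in one step, naming the new column explicitly
Mangroves <- cbind(pop = 1, MangrovesRaw)

# Option 2: pre-allocate, then fill a block of columns and copy the names
Mangroves2 <- data.frame(matrix(nrow = nrow(MangrovesRaw),
                                ncol = ncol(MangrovesRaw) + 1))
colnames(Mangroves2)[1] <- "pop"
Mangroves2$pop <- 1
Mangroves2[, 2:ncol(Mangroves2)] <- MangrovesRaw
colnames(Mangroves2)[2:ncol(Mangroves2)] <- colnames(MangrovesRaw)
```

Assigning a whole data frame to a single column with $ is what produced the X338. prefixes in the question; assigning to a block of columns with [ , 2:23] avoids that.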

R matrix and data.frame mix-up

I have a data.frame in R with columns that also have column names.
I have another data.frame with 0s and -1s that controls which columns to use from the first data.frame in a subsequent analysis.
I now ran into an issue that I cannot wrap my head around.
First of all, the "offending" line of code is:
covar.data<-covar.data[,!onoff]
FYI I have confirmed both covar.data and onoff are data.frames.
When I run this with onoff selecting 2 or more columns, everything is fine, and the resulting covar.data is still a data.frame - and this is important, because I need to use the column names in the rest of my analysis.
However, if I have onoff selecting only 1 column, covar.data turns into a matrix!! This is a problem, because the column name also disappears!
I tried
covar.data<-as.data.frame(covar.data[,!onoff])
and
covar.data<-as.data.frame(covar.data[,!onoff], col.names=TRUE)
but that didn't make a difference in the disappearance of the column name.
I don't understand why R decides to turn the data.frame into a matrix (only for the times I am left with one column), and I cannot figure out how to preserve the data.frame PLUS the column names.
If you select a single column of a data.frame, R assumes you want to extract that data as a vector rather than returning another data.frame (and in most cases this is exactly the behavior you want). But if you do want to keep that single column as a data.frame, then you should do
covar.data <- covar.data[, !onoff, drop = FALSE]
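The effect of drop is easy to see on a toy example. The data and the logical selector keep below are assumptions made for illustration, not the asker's actual objects:

```r
# Toy stand-in for covar.data and a selector that picks a single column
covar.data <- data.frame(age = c(30, 40), bmi = c(22.5, 27.1))
keep <- c(TRUE, FALSE)

as_vector  <- covar.data[, keep]                # collapses to a plain vector
as_frame   <- covar.data[, keep, drop = FALSE]  # stays a data.frame
also_frame <- covar.data[keep]                  # list-style indexing keeps the frame too
```

Writing drop = FALSE rather than drop = F is slightly safer, because F is an ordinary variable that can be reassigned while FALSE cannot.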

how to remove rows from dataframe in r (different situation)

I want to remove the first row of a data frame. My code is
reddot.info<-reddot.info[-1,]
But then I find I cannot view the data frame. I think I figured out the reason: when I run
reddot.info[1,]
it appears as
Num Product_names URL
1 NA Product Names URL
which means that if I use this code I will also remove the column names.
So what should I do to remove only the first row, instead of removing the column names and the first row together?
Thank you so much.
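You probably don't have proper column names to begin with, because removing the first row like that doesn't remove the column names. If the first row actually holds the header, use it as the column names (converted to a character vector) and then drop it:
colnames(reddot.info) <- as.character(unlist(reddot.info[1,]))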
reddot.info <- reddot.info[-1,]
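A sketch of promoting a header row to column names, on made-up data where the real header ended up in row 1 (the product values are invented, and the column names V1 and V2 are assumed placeholders):

```r
# Toy stand-in: the header was read in as the first data row
reddot.info <- data.frame(V1 = c("Num", "1", "2"),
                          V2 = c("Product Names", "Lamp", "Chair"),
                          stringsAsFactors = FALSE)

# Turn the first row into a plain character vector before using it as names
colnames(reddot.info) <- as.character(unlist(reddot.info[1, ]))

# Drop the header row and renumber the remaining rows from 1
reddot.info <- reddot.info[-1, ]
rownames(reddot.info) <- NULL
```

Converting the row with unlist() and as.character() matters when columns are factors, since a factor would otherwise contribute its integer codes rather than its labels.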

Change part of data frame to character, then to numeric in R

I have a simple problem. I have a data frame with 121 columns. Columns 9:121 need to be numeric, but when imported into R they are a mixture of numerics, integers and factors. Columns 1:8 need to remain characters.
I’ve seen some people use loops, and others use apply(). What do you think is the most elegant way of doing this?
Thanks very much,
Paul M
Try the following... The apply function allows you to loop over either rows, cols, or both, of a dataframe and apply any function, so to make sure all your columns from 9:121 are numeric, you can do the following:
table[,9:121] <- apply(table[,9:121],2, function(x) as.numeric(as.character(x)))
table[,1:8] <- apply(table[,1:8], 2, as.character)
Where table is the dataframe you read into R.
Briefly I specify in the apply function the table I want to loop over - in this case the subset of your table we want to make changes to, then we specify the number 2 to indicate columns, and finally give the name of the as.numeric or as.character functions. The assignment operator then replaces the old values in your table with the new ones of correct format.
-EDIT: Just changed the first line, as I recalled that if you convert a factor directly to a number, you get the integer code of the factor level and not the number you think you are getting. Factors first need to be converted to characters and then to numbers, which we can do just by wrapping as.character inside as.numeric.
When you read in the table, use stringsAsFactors=FALSE; then there will not be any factors.
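An alternative sketch uses lapply() instead of apply(), which converts each column on its own rather than passing the whole block through a matrix. The data frame dat and the index vector num_cols are hypothetical, with three columns standing in for the 121 in the question:

```r
# Toy stand-in: one character column, one factor column, one integer column
dat <- data.frame(id = c("a", "b"),
                  x = factor(c("10", "20")),
                  y = c(1L, 2L),
                  stringsAsFactors = FALSE)

num_cols <- 2:3  # would be 9:121 in the question

# Go through character first so factors yield their labels, not level codes
dat[num_cols] <- lapply(dat[num_cols],
                        function(col) as.numeric(as.character(col)))

# Make sure the remaining columns are character
dat[1] <- lapply(dat[1], as.character)
```

Values that cannot be parsed as numbers become NA with a warning, so it may be worth checking for unexpected NAs afterwards.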
