Call a specific column name in R - r

colnames gives me the column names for a whole dataframe. Is there any way to get the name of one specified column? I would need this for naming labels when plotting data in ggplot.
So say my data is like this:
df1 <- data.frame(a=sample(1:50,10), b=sample(1:50,10), c=sample(1:50,10))
I would need something like paste(colnames(df1[,1])) which obviously won't work.
Any ideas?

You can get the name like this:
colnames(df1)[1]
# i.e. take the first element of colnames(df1), not the colnames of the first column
However, if you drop the comma, e.g.
colnames(df1[1])
you also get the name, because [x], unlike [,x], [[x]] or $x, keeps the data.frame structure instead of reducing the column to a vector.
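A small sketch of the difference, using a fixed data frame with made-up values in place of the question's randomly sampled df1:

```r
# Illustrative data frame; the values are arbitrary stand-ins for the question's df1
df1 <- data.frame(a = 1:3, b = 4:6, c = 7:9)

colnames(df1)[1]      # "a": first element of the vector of column names
colnames(df1[1])      # "a": df1[1] is still a one-column data frame
colnames(df1[, 1])    # NULL: df1[, 1] has been reduced to a plain vector

# The name can then be reused as text, e.g. for an axis label or title
first_name <- colnames(df1)[1]
label <- paste("Values of column", first_name)
```

The label string here is only an example of how the extracted name might be fed into a plot label.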

names(df1)[1]
will give you the name of the first column. So too will
names(df1[1])
Neither uses a comma.

Would colnames(df1)[1] solve the problem?

Related

Names above the numbers of a numerical vector in R. How to change them?

Several times in R I have come across a numeric vector that has a name above each numeric value, like this:
class(oral_NO_AR_comp$clustering$clust1)
output: [1] "numeric"
and the content looks like this:
The point here is that I need to change the names above the numbers. Is there a way to do that?
You can get those names with
names(oral_NO_AR_comp$clustering$clust1)
You can use
names(oral_NO_AR_comp$clustering$clust1) <- <whatever you want>
# or
setNames(oral_NO_AR_comp$clustering$clust1, <whatever you want>)
to change the names if you like. You can also remove the names with
unname(oral_NO_AR_comp$clustering$clust1)
Note that these functions (with the exception of names<-) do not change the original value; they return a new value. If you want to replace the original value, be sure to assign the result back to the original variable with <-.
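As a rough illustration of which calls modify the vector and which only return a copy, here is a sketch on a small made-up vector (clust and its names are hypothetical stand-ins for oral_NO_AR_comp$clustering$clust1):

```r
# Hypothetical named numeric vector standing in for the clustering result
clust <- c(s1 = 1, s2 = 2, s3 = 1)

names(clust)                                   # "s1" "s2" "s3"

renamed <- setNames(clust, c("x", "y", "z"))   # returns a renamed copy
bare <- unname(clust)                          # returns a copy without names
names(clust)                                   # still "s1" "s2" "s3"

names(clust) <- c("x", "y", "z")               # changes clust itself
```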

improving specific code efficiency - *base R* alternative to for() loop solution

Looking for a vectorized base R solution for my own edification. I'm assigning a value to a column in a data frame based on a value in another column in the data frame.
My solution creates a named vector of possible codes, looks up the code in the original column, subsets the named vector by the value found, and assigns the resulting name to the new column. I'm sure there's a way to do exactly this using the named vector I created that doesn't need a for loop; is it some version of apply?
dplyr is great and useful and I'm not looking for a solution that uses it.
# reference vector for assigning more readable text to this table
tempAssessmentCodes <- setNames(c(600,301,302,601,303,304,602,305,306,603,307,308,604,309,310,605,311,312,606,699),
c("base","3m","6m","6m","9m","12m","12m","15m","18m","18m","21m","24m","24m","27m","30m","30m",
"33m","36m","36m","disch"))
for(i in 1:nrow(rawDisp)){
rawDisp$assessText[i] <- names(tempAssessmentCodes)[tempAssessmentCodes==rawDisp$assessment[i]]
}
The standard way is to use match():
rawDisp$assessText <- names(tempAssessmentCodes)[match(rawDisp$assessment, tempAssessmentCodes)]
For each element of x, match(x, y) returns the index of its first match in y (or NA if there is none). We then use those indices to pick the corresponding names of y, which replaces the values with names.
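A small sketch of what match() returns, on a shortened, made-up version of the lookup vector (the code 999 is an invented value included only to show the unmatched case):

```r
# Shortened, illustrative version of the lookup vector from the question
codes <- c(base = 600, "3m" = 301, "6m" = 302, disch = 699)
assessment <- c(301, 699, 600, 999)   # 999 is deliberately not a known code

idx <- match(assessment, codes)       # positions in codes: 2 4 1 NA
names(codes)[idx]                     # "3m" "disch" "base" NA
```

Unlike the loop in the question, which fails on assignment when a code has no match, this version yields NA for unknown codes.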
Personally, I do it the opposite way: build a vector from tempAssessmentCodes whose names are the old codes and whose values are the new labels:
codes <- setNames(names(tempAssessmentCodes), tempAssessmentCodes)
Then simply select elements from the new codes using the names (old codes):
rawDisp$assessText <- codes[as.character(rawDisp$assessment)]
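The reversed lookup can be tried end to end with a sketch like the following. The rawDisp data frame and the four codes are made-up stand-ins, and the unname() call is an addition not in the answer: without it, the new column carries the old codes as element names.

```r
# Illustrative subset of the question's codes (labels as names, codes as values)
tempAssessmentCodes <- c(base = 600, "3m" = 301, "6m" = 302, disch = 699)
# Hypothetical stand-in for the question's rawDisp data frame
rawDisp <- data.frame(assessment = c(302, 600, 699))

# Reverse the vector: names are the old codes, values are the labels
codes <- setNames(names(tempAssessmentCodes), tempAssessmentCodes)

# Index by name; unname() drops the codes that would otherwise label the result
rawDisp$assessText <- unname(codes[as.character(rawDisp$assessment)])
```

Reversing works here because the numeric codes are unique, even though some labels in the full vector (such as "6m") repeat.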

Remove multiple rows from a list of names in R (a list of 187 names to remove)?

I have a data frame in R containing over 29,000 rows. I need to remove multiple rows using only a list of names (187 names).
My dataset is about airlines, and I need to remove specific airlines from my data set that contains over 200 types of airlines. My first column contains all airline names, and I need to remove the entire row for those specific airlines.
I singled out all airline names that I want removed by this code: transmute(a_name_remove, airline_name). This gave me a table of all names of airlines that I want removed, now I have to remove that list of names from my original dataset named airlines.
I know there is a way to do this manually, which is: mydata[-c("a", "b"), ], for example. But writing out each name would be hectic.
Can you please help me find a way to use the list that I have to remove those rows from my dataset directly?
I cannot write out each name on its own.
I also tried this: airlines[!(row.names(airlines) %in% c(remove)), ], in which I made my list "removed" into a data frame and as a vector, then used that code to remove it from my original dataset "airlines", still did not work.
Thank you!
You can create an operator that negates %in%, e.g.
'%not_in%' <- Negate('%in%')
So, per your code, it should look like this:
airlines[row.names(airlines) %not_in% remove, ]
Additionally, I do not recommend using remove as a variable name, since it is a base function in R. If possible, rename the variable, e.g. to discard_airlines:
airlines[row.names(airlines) %not_in% discard_airlines, ]
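If the names are stored in a column rather than in the row names, as the question's description suggests, the comparison presumably needs to use that column. A sketch with made-up data, assuming the column is called airline_name as in the question's transmute() call:

```r
# Hypothetical data: airline names sit in the first column, as the question describes
airlines <- data.frame(
  airline_name = c("Alpha Air", "Beta Jet", "Gamma Wings", "Beta Jet"),
  rating = c(7, 4, 9, 5)
)
# Plain character vector of the names to drop (made-up values)
discard_airlines <- c("Beta Jet", "Gamma Wings")

`%not_in%` <- Negate(`%in%`)

# Filter on the column that holds the names rather than on row.names()
kept <- airlines[airlines$airline_name %not_in% discard_airlines, ]
```

If the names to drop live in a data frame, its column would first have to be pulled out as a vector (for example with $ or [[ ]]) before being used on the right-hand side.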

How to access a column after subsetting data frame?

It has to be really simple but it looks like my mind is not working properly anymore.
So, what I would like to do is to store one of the columns from mtcars as a vector but after subsetting it. I need one line code for the subsetting and assigning a vector.
That's what I would like to achieve but with one line:
data <- mtcars[mtcars[,11]==4,]
vec <- data[,1]
Thx!
vec <- mtcars[mtcars[,11]==4,][,1]
Here mtcars[,11]==4 serves as the row index, and by giving 1 as the column index we get the first column for the subset of rows that meet the condition:
mtcars[mtcars[,11]==4, 1]
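A quick check that both forms give the same vector. The third line is an added by-name variant, relying on column 11 of mtcars being carb and column 1 being mpg:

```r
# mtcars ships with R; column 11 is carb and column 1 is mpg
vec_two_step <- mtcars[mtcars[, 11] == 4, ][, 1]   # subset rows, then take column 1
vec_one_step <- mtcars[mtcars[, 11] == 4, 1]       # rows and column in one call
vec_by_name  <- mtcars$mpg[mtcars$carb == 4]       # same selection using names

identical(vec_two_step, vec_one_step)              # TRUE
```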

How should I apply the same formatting to a list of dataframes in R?

Here is what I've done so far. So, that's basically grabbing some tables off the internet using XML, putting them into a list of dataframes and then some mess trying (and failing) to format them in an efficient and consistent way.
I can't work out how to apply the same changes to all of the dataframes. I think I need to use llply, but I can't get it right. Overall I am trying to achieve:
1. Column names all legitimate R names using make.names, then use the str_replace_all towards the end of the file to strip all non-alpha characters so the names are the same
2. Next I want to remove all but the first four columns from all of the dataframes
3. Then I want to add a column with the title for each book. I guess I'll have to do this manually.
4. Finally, I want to do an rbind to join all of the dataframes together
What's really got me stumped is how to apply the same transformations to each dataframe in the list such as modifying their column names and cutting off rows. Is llply the right tool for the job? How do I use it?
So far the most I've been able to achieve is turning my list of dataframes into a list of vectors with the right names. I believe this is because when I tried using names() it returned the vector of correct names, rather than a dataframe with the correct names. This was my attempt:
tlist <- llply(tabs, function(x) as.data.frame(str_replace_all(make.names(names(x)), "[^[:alpha:]]", "")))
I don't think I'm a million miles away here, but I can't think how to get it to return the full df.
Use this instead:
f <- function(x)
{
y <- x[,1:4]
names(y) <- str_replace_all(make.names(names(y)), "[^[:alpha:]]", "")
y
}
result <- rbind.fill(llply(tabs, f))
EDIT: following @baptiste, this may be better:
result <- ldply(tabs, f)
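For comparison, the same steps can be sketched in base R, with lapply() in place of llply(), gsub() in place of str_replace_all(), and do.call(rbind, ...) for the final join. This is a swapped-in alternative, not the answer's method, and the tabs list and its column names below are made up for illustration:

```r
# Hypothetical stand-in for the list of scraped tables; column names are made up
tabs <- list(
  data.frame("Title 1" = c("A", "B"), "Year." = c(1990, 1995),
             "Pages" = c(100, 200), "Rank#" = 1:2, "Notes" = c("x", "y"),
             check.names = FALSE),
  data.frame("Title 2" = "C", "Year" = 2001,
             "Pages " = 300, "Rank" = 3L, "Notes" = "z",
             check.names = FALSE)
)

# Keep the first four columns and reduce their names to letters only
clean_table <- function(x) {
  y <- x[, 1:4]
  names(y) <- gsub("[^[:alpha:]]", "", make.names(names(y)))
  y
}

# Apply the same cleaning to every table, then stack the results
result <- do.call(rbind, lapply(tabs, clean_table))
```

The key point is the same as in the answer: the function returns the modified data frame y, not just its vector of names. Note that rbind() requires the cleaned names to match across tables, whereas rbind.fill() pads missing columns with NA.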
