Adding columns and putting the values in a new row in R script - r

I am trying to sum particular columns of a data frame and add these sums as a new row of the same data frame.
TCM <- colSums(df[3:16])  # this adds up all the values
Now, in the same file "TCM", I want to have the newly summed values in the last row, named Total.

The last row needs to be the same length as your other rows if you want to bind them. The code below assumes you want the first column to be "Total" and the second column to be NA. If you have different values in mind, simply change the respective inputs.
first_val <- "Total"
second_val <- NA
# build the new row as a one-row data frame: c(first_val, second_val, TCM)
# would coerce every sum to character
to_bind <- data.frame(first_val, second_val, t(TCM))
names(to_bind) <- names(df)  # assumes df has exactly 16 columns
df <- rbind(df, to_bind)
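As a quick check of the idea, here is a minimal sketch on a made-up four-column data frame (the column names id, group, v1 and v2 are hypothetical, not from the question). It builds the total row as a one-row data frame so that the summed columns stay numeric after rbind().

```r
# Toy data standing in for the real df (hypothetical columns)
df <- data.frame(
  id    = c("a", "b"),
  group = c("x", "y"),
  v1    = c(1, 2),
  v2    = c(10, 20)
)

# Column sums of the numeric columns (columns 3 and 4 here)
TCM <- colSums(df[3:4])

# One-row data frame: t(TCM) turns the named vector into a 1 x 2 matrix,
# so the sums become columns v1 and v2
total_row <- data.frame(id = "Total", group = NA, t(TCM))

df <- rbind(df, total_row)
print(df)
print(sapply(df, class))  # v1 and v2 remain numeric
```

Building the row with c("Total", NA, TCM) instead would give a character vector, and rbind() would then turn every column into character.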

Related

Checking for specific set of values in a data frame column in R

I have a dataframe that looks something like the following.
KEYCODE= c("A","B","C","D")
values = list(c("12,3"),c("2"),c("2,7"),c("16"))
df = data.frame(KEYCODE=KEYCODE, values=cbind(values))
Now, I want to create a third column that holds 0 when the values field does not contain anything between 7 and 18, and 1 otherwise. For instance,
My new df with the new column
Unfortunately, the elements in the values field are strings.
Thanks in advance.
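One possible approach, as a minimal sketch: split each string on commas, convert the pieces to numbers, and test whether any of them falls in the range. This assumes the bounds 7 and 18 are inclusive and that values is stored as a plain character column rather than a list column. The helper in_range and the column name flag are made up for illustration.

```r
KEYCODE <- c("A", "B", "C", "D")
values  <- c("12,3", "2", "2,7", "16")
df <- data.frame(KEYCODE = KEYCODE, values = values)

# Hypothetical helper: 1 if any comma-separated number lies in [lo, hi], else 0
in_range <- function(s, lo = 7, hi = 18) {
  nums <- as.numeric(strsplit(s, ",", fixed = TRUE)[[1]])
  as.integer(any(nums >= lo & nums <= hi))
}

# Apply the helper to every string in the column
df$flag <- vapply(df$values, in_range, integer(1), USE.NAMES = FALSE)
print(df)
```

If the bounds should be exclusive, replace >= and <= with > and <.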

Is there an R method to select the columns of a dataframe that are listed in a separate array

I have a dataframe with over 100 columns. After applying certain conditions, I need a subset of the dataframe containing only the columns that are listed in a separate array.
The array has 50 entries with 2 columns. The first column has the selected variable names and the second column has some associated values.
I wish to build a new data frame with just the variables mentioned in the first column of the separate array. Could you please point me to how to proceed?
Try this:
library(dplyr)
# all_of() matches the names exactly; contains() would also pick up partial matches
iris <- iris %>% select(all_of(dataframe_with_names$names))
In R you can use square brackets [rows, columns] to select specific rows or specific columns. (Leaving either blank selects all).
If you had a vector of column names you wanted to keep called important_columns you could select only those columns with:
myData[,important_columns]
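In your case the vector of column names is actually a column in your array. So you select that column and use it as your vector (this assumes the array is a data frame with a column called names; for a matrix, use array[, 1] instead):
myData[, array$names]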
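A minimal sketch of the bracket approach, using the built-in iris data in place of the real data frame. The lookup table and its column names (names, value) are assumptions for illustration. Converting the names with as.character() guards against the case where they were read in as a factor, since indexing by a factor would use its integer codes.

```r
# Hypothetical lookup table: first column holds the wanted variable names
lookup <- data.frame(
  names = c("Sepal.Length", "Species"),
  value = c(1, 2)
)

# Use the names as a plain character vector
keep <- as.character(lookup$names)

# drop = FALSE keeps the result a data frame even if only one column is selected
subset_df <- iris[, keep, drop = FALSE]
head(subset_df)
```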

Field names repeated in output from loop to calculate new fields in R data frame

I'm using a for loop to create a set of new columns in an R dataframe, but in the output the original columns are duplicated, with the dataframe name added as a prefix, and the new columns also have this prefix, which I don't want. I simply want the new output to be the same as the original dataframe, but with a set of new columns containing the new calculations. How do I achieve this? Details below:
These are the columns of the original dataframe:
Area; SR_2005;SR_2006;SR_2007;SR_2008;xnull_SR_2005;xnull_SR_2006;xnull_SR_2007;xnull_SR_2008
I then wanted to add a series of new fields to this dataframe, where each "SR" column is divided by its corresponding "xnull_SR" column (e.g. SR_2005 / xnull_SR_2005); each of these new fields would be prefixed with "p_" (e.g. "p_2005"). Here is the code I've used:
for (j in 2005:2019)
{field = paste("p_", j, sep = "")
restab1 <- within(restab1, restab1[[field]] <- get(paste("SR_",j, sep = ""))/ get(paste("xnull_SR_",j, sep = "")))
}
What I hoped for is that I would just get the original data fields with the new fields ("p_2005", "p_2006" etc.) added. Instead, I do get the new fields, but they are all prefixed with the name of the dataframe (e.g. restab1.p_2005), and the original fields are also repeated, once with just the field name (e.g. "SR_2005") and once with the dataframe prefix (e.g. "restab1.SR_2005"). Therefore, these are the field names in the changed dataframe:
area SR_2005 SR_2006 SR_2007 SR_2008 xnull_SR_2005 xnull_SR_2006 xnull_SR_2007 xnull_SR_2008 restab1.area restab1.SR_2005 restab1.SR_2006 restab1.SR_2007 restab1.SR_2008 restab1.xnull_SR_2005 restab1.xnull_SR_2006 restab1.xnull_SR_2007 restab1.xnull_SR_2008 restab1.p_2005 restab1.p_2006 restab1.p_2007 restab1.p_2008
The calculations in the new fields (restab1.p_2005 restab1.p_2006 etc.) are correct but I just want the dataframe to contain the old and new field names once, and without the "restab1" prefix. How do I achieve this?
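The prefixed columns appear because the assignment to restab1[[field]] inside within() modifies a copy of the whole data frame, which within() then stores as a new column. Since all columns of a data frame have the same number of rows, consider simple element-wise division across multiple columns instead: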
restab1[paste0("p_", 2005:2008)] <- restab1[paste0("SR_", 2005:2008)] / restab1[paste0("xnull_SR_", 2005:2008)]
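If the loop form is preferred, a minimal sketch is to drop within() and get() and index the data frame directly with [[ ]]. The toy data below uses only two years and made-up numbers; the column naming follows the question.

```r
# Toy stand-in for restab1 (values are invented)
restab1 <- data.frame(
  Area          = c("a", "b"),
  SR_2005       = c(10, 20),
  SR_2006       = c(30, 40),
  xnull_SR_2005 = c(2, 4),
  xnull_SR_2006 = c(3, 8)
)

for (j in 2005:2006) {
  # Build the three column names for year j and assign the ratio directly
  restab1[[paste0("p_", j)]] <-
    restab1[[paste0("SR_", j)]] / restab1[[paste0("xnull_SR_", j)]]
}

names(restab1)
```

The loop range should only cover years for which both the SR_ and xnull_SR_ columns exist.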

Unique function leaves out column

Say I use R's unique() function in a script to pull particular columns out of a premade dataframe to make a new one:
SUPSCIARIDS<-unique(SuperiorSciarids[,c(36,2,3,4:34)])
36-LOGID
2-Decay
3-Diameter
4:34 are the species
Why do you think the new data frame does not show column 2?
I've found the answer: that annoying dummy column "row.names" that often appears before column one.
I was counting row.names as the first column in my matrix, so if I add 1 to the beginning of the column query it brings up the Decay variable in my unique table. Bingo! And how embarrassing...
SUPSCIARIDS<-unique(SuperiorSciarids[,c(35,1,2,3,4:34)])
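Off-by-one problems like this can be avoided by selecting columns by name instead of by position. A minimal sketch on made-up data, assuming the columns really are called LOGID, Decay and Diameter as in the list above (the species columns are omitted for brevity):

```r
# Toy stand-in for SuperiorSciarids (values are invented)
SuperiorSciarids <- data.frame(
  Decay    = c(1, 1, 2),
  Diameter = c(5, 5, 7),
  LOGID    = c("L1", "L1", "L2")
)

# Select by name, so the row.names display cannot shift the count
wanted <- c("LOGID", "Decay", "Diameter")
SUPSCIARIDS <- unique(SuperiorSciarids[, wanted])
print(SUPSCIARIDS)
```

names(SuperiorSciarids) lists the real column names and their order, which also shows that row.names is not counted as a column.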

Replacing select columns in a data frame with new columns of equal length

I have a data frame which I will call "abs.data" that contains 265 columns (variables). I have another data frame which I will call "corr.abs" that contains updated data on a subset of the columns in "abs.data". Both data frames have an equal number of rows, n=551. I need to replace the columns in "abs.data" with the correct observations in "corr.abs" where the column names match. I have tried the following
abs.samps <- colnames(abs.data) #vector of column names in abs. data
corr.abs.samps <- colnames(corr.abs) #vector of column names in corr.abs
abs.data[,which(abs.samps %in% corr.abs.samps==TRUE)] <- corr.abs[,which(corr.abs.samps %in% abs.samps==TRUE)] #replace columns in abs.data with correct observations in corr.abs where the column names are the same
When I run the left and right sides of the last line of code separately, R pulls the right columns, but the assignment fails to replace the columns in abs.data with the correct data in corr.abs. Any ideas why?
You can find the common column names using
comm_col <- intersect(colnames(abs.data), colnames(corr.abs))
For example, suppose you find that X2 is the common column.
You can first drop the outdated column (here X2) from abs.data using subset
x <- subset(abs.data, select = -X2)
then add the corrected column, under its original name X2, to the new data frame
y <- cbind(X2 = corr.abs$X2, x)
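As an alternative sketch, the shared columns can be overwritten in place by indexing both data frames with the same vector of names. This keeps the original column order and does not depend on the two data frames listing their columns in the same order. The toy data below is made up; only the names abs.data and corr.abs come from the question, and it assumes the rows of the two data frames are already aligned.

```r
# Toy stand-ins for the real data frames (values are invented)
abs.data <- data.frame(X1 = 1:3, X2 = c(0, 0, 0), X3 = 7:9)
corr.abs <- data.frame(X2 = c(4, 5, 6))

# Names present in both data frames
comm_col <- intersect(colnames(abs.data), colnames(corr.abs))

# Name-based assignment: each column is replaced by its namesake
abs.data[comm_col] <- corr.abs[comm_col]
print(abs.data)
```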
