DataFrame nrows null after adding a column - r

I am using DataFrames in C++ code that I call from an R package. At some point, I want to add a column of ones at the front of my DataFrame. I do so as follows, where df is my DataFrame:
df.push_front(NumericVector(df.nrows(), 1.0), "(Intercept)");
After this, I expect df to contain its original columns plus a new one full of ones at the beginning. And it seems to be doing that.
But the problem is that now df.nrows() is equal to zero, and not what it used to be. Am I missing something?

Related

Replacing df values in a column with values from another df via key

I need to replace the values in the Nth column of my df, call these values v1s, with values from another df, call them v2s. There is a dictionary, or rather two dictionaries. The first one translates v1s into numbers, the second one translates the numbers into v2s. I tried merge(), left_join()/right_join(), and a few other things, but nothing seems to work. Can somebody help please?
Merging the datasets should work. Try the code until you can make it work.
Otherwise, if the rows of both datasets are already in the same order, you can simply add an extra column to your dataset with
datasetA$newvar <- datasetB$v2s
Once you have correctly added the second variable, simply drop the first.
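When the rows are not aligned, the two-dictionary translation described in the question can be sketched with match() in base R. This is only a minimal sketch: the data frames df, dict1 and dict2 and their column names are hypothetical stand-ins, since the question does not show the real data.

```r
# Hypothetical data: df holds the v1 values, dict1 maps v1 -> number,
# and dict2 maps number -> v2.
df <- data.frame(id = 1:4, v1 = c("a", "b", "a", "c"),
                 stringsAsFactors = FALSE)
dict1 <- data.frame(v1 = c("a", "b", "c"), num = c(10, 20, 30),
                    stringsAsFactors = FALSE)
dict2 <- data.frame(num = c(30, 10, 20), v2 = c("gamma", "alpha", "beta"),
                    stringsAsFactors = FALSE)

# Step 1: translate each v1 into its number (row order of dict1 is irrelevant).
num <- dict1$num[match(df$v1, dict1$v1)]

# Step 2: translate each number into its v2 and overwrite the column.
df$v1 <- dict2$v2[match(num, dict2$num)]
```

Values that are missing from either dictionary come out as NA, which makes incomplete dictionaries easy to spot.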

Adding new column with data value in R

I want to add a column (say ForestAreaPerPopn) holding the ratio of forest area to the resident population (represented by the variable Total below). The data contains the following variables and their values.
How can I add a column named ForestAreaPerPopn to the table ForestAreaPerPop (shown below) so that the column contains the ratio of forest area to Total?
Too long for a comment.
You have a couple of problems. First, your column names have spaces and other special characters. This is allowed but creates all kinds of problems later. I suggest you do something like:
colnames(ForestAreaPerPop) <- gsub(' |\\(|\\)', '_', colnames(ForestAreaPerPop))
This will replace any spaces and left or right parens in the colnames with '_'.
Then, something like:
ForestAreaPerPop$n <- with(ForestAreaPerPop, Forest_Area_in_ha/Total)
should give you what you want.
Some advice: long table names and column names may seem like a good idea, but you will live to regret it. Make them short but meaningful (easier said than done).
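Put together, the two steps might look like the sketch below. The table and its values are made up for illustration (the original data is not shown), so the cleaned column name here is an assumption. It is worth printing names() after the gsub() call, because each space and paren becomes its own underscore.

```r
# Hypothetical table; the column names imitate headers with spaces and parens.
forest <- data.frame(
  "Forest Area (in ha)" = c(1200, 450, 3000),
  "Total" = c(600, 900, 1500),
  check.names = FALSE  # keep the awkward names exactly as written
)

# Replace every space, "(" and ")" with an underscore.
names(forest) <- gsub(" |\\(|\\)", "_", names(forest))
# "Forest Area (in ha)" is now "Forest_Area__in_ha_"

# Add the ratio column using the cleaned name.
forest$ForestAreaPerPopn <- forest[["Forest_Area__in_ha_"]] / forest$Total
```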

Combine lapply and gsub to replace a list of values for another list of values

I am currently looking for a way to simplify searching through a column within a dataframe for a vector of values and replacing each of those values with another value (also contained within a separate vector). I can run a for loop for this, but it must be possible within the apply family, I'm just not seeing it yet. Very new to using the apply family and could use help.
So far, I've been able to have it replace all instances of the first value in my vector with the new first value in the new vector, it just isn't iterating past the first level. I hope this makes sense. Here is the code I have:
#standardize tank location
old_tank_list <- c("7.C.4","7.C.5","7.C.6","7.C.7","7.C.8","7.C.9","7.C.10","7.C.11")
new_tank_list <- c("7.B.3-4","7.C.3-4","7.C.1-2","7.C.5-6","7.C.7-8","7.C.9-10","7.E.9-10","7.C.11-12")
sapply(df_growth$Tank, function(y) gsub(old_tank_list, new_tank_list, y))
Tank is the name of the column I am trying to replace all of these values within. I haven't assigned it back yet, because I want to test the functionality first. Thanks for any help you can offer.
Hopefully, this image will help. The photo on the left is the column before my function is applied. The column on the right is after. Basically, I just want to batch change text values.
Before and After
library(dplyr)
df %>%
  mutate(Tank = recode(Tank, !!!setNames(new_tank_list, old_tank_list)))
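The same lookup can be sketched in base R with match(), which compares whole strings. That sidesteps the behaviour seen in the question: gsub() uses only the first element of a pattern vector, and it treats "." as a regex wildcard. The tank vector below is hypothetical sample data standing in for df_growth$Tank.

```r
old_tank_list <- c("7.C.4", "7.C.5", "7.C.6", "7.C.7",
                   "7.C.8", "7.C.9", "7.C.10", "7.C.11")
new_tank_list <- c("7.B.3-4", "7.C.3-4", "7.C.1-2", "7.C.5-6",
                   "7.C.7-8", "7.C.9-10", "7.E.9-10", "7.C.11-12")

# Hypothetical column values; "9.A.1" has no entry in the old list.
tank <- c("7.C.4", "7.C.10", "9.A.1", "7.C.11")

# Position of each value in the old list (NA when it is not listed).
idx <- match(tank, old_tank_list)

# Take the new name where a match exists, otherwise keep the original value.
tank_std <- ifelse(is.na(idx), tank, new_tank_list[idx])
```

Assigning tank_std back to the column would complete the replacement.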

Why does R think my imported vector of characters are numbers?

This is probably a basic question, but why does R think my vector, which has a bunch of words in it, contains numbers when I try to use it as column names?
I imported a data set and it turns out the first row of data are the column headers that I want. The column headers that came with the data set are wrong ones. So I want to replace the column names. I figured this should be easy.
So what I did was I extracted the first row of data into a new object:
names <- data[1,]
Then I deleted the first row of data:
data <- data[-1,]
Then I tried to rename the column headers with the "names" object:
colnames(data) <- names
However, when I do this, instead of changing my column names to the words within the names object, it turns them into a bunch of numbers. I have no idea where these numbers come from.
Thanks
You need to actually show us the data, and the read.csv()/read.table() command you used to import.
If R thinks your numeric column is a string, that is probably because the column wrongly includes its own name as a value, i.e. you omitted header=TRUE in your read.csv()/read.table() import.
But show us your actual data and commands used.
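Without the real data this can only be a guess, but one plausible source of the numbers is that the columns were imported as factors, in which case a row of the data frame carries internal integer codes rather than the words themselves. The sketch below assumes that situation and uses a made-up data frame. It converts the first row to plain character strings before using it as column names.

```r
# Hypothetical import result: the true header ended up as the first data row,
# and every column became a factor.
raw <- data.frame(
  V1 = c("site", "north", "south"),
  V2 = c("count", "12", "7"),
  stringsAsFactors = TRUE
)

# Convert each cell of the first row to its label, not its integer code.
header <- unname(vapply(raw[1, ], as.character, character(1)))

# Drop the header row and apply the recovered names.
clean <- raw[-1, ]
colnames(clean) <- header
```

Re-importing with header=TRUE, as suggested above, avoids the problem at the source and is usually the simpler fix.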

Changing hundreds of column names simultaneously in R

I have a data frame with hundreds of columns whose names I want to change. I'm very new to R, so it's rather easy to think through the logic of this, but I simply can't find a relevant example online.
The closest I could sort of get was this:
projectFileAllCombinedNames <- for (i in 1:200){names(projectFileAllCombined)[i+1] <-variableNames[i]}
Basically, starting at the second column of projectFileAllCombined, I want to loop through the columns in the dataframe and assign them the data values in the second data frame. I was able to change one column name manually with this code:
colnames(projectFileAllCombined)[2]<-"newColumnName"
but I can't possibly do that for hundreds of columns. I've spent multiple hours on this and can't crack it with any number of Google searches on "change multiple columns in r" or "change column names in r". The best I can find online is examples where people change a few columns with a c() function and I get how that works, but that still seems to require typing out all the column names as parameters to the function, unless there is a way to just pass the "variableNames" file into that c() function, but I don't know of one.
Will
colnames(projectFileAllCombined)[-1] <- variableNames
not suffice?
This assumes the ordering of columns in projectFileAllCombined is the same as the ordering of the new variable names in variableNames, and that
length(variableNames) == (ncol(projectFileAllCombined) - 1)
The key point here is that the replacement function 'colnames<-'() is vectorised and can replace any number of column names in a single call if passed a vector of replacement values.
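A small, self-contained illustration of that vectorised assignment follows. The data frame contents and the new names are invented for the example; only the object names come from the question. The stopifnot() line is an optional guard for the length assumption stated above.

```r
# Hypothetical stand-ins for the objects in the question.
projectFileAllCombined <- data.frame(
  id = 1:2,
  V1 = c(0.1, 0.2),
  V2 = c(3, 4),
  V3 = c("x", "y")
)
variableNames <- c("height", "weight", "label")

# Guard: one new name for every column except the first.
stopifnot(length(variableNames) == ncol(projectFileAllCombined) - 1)

# Rename columns 2..n in a single call; column 1 keeps its name.
colnames(projectFileAllCombined)[-1] <- variableNames
```

If the new names are stored in a column of another data frame rather than in a plain vector, that column would need to be extracted as a character vector first.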
