How to name a new dataframe based on input character value - r

In R, I am trying to get input from a user and use it to build the name of a new data frame, e.g.:
number <- readline(prompt = "what is your number:")
This creates a character vector with one element, e.g. number is "4".
Now I want to create a data frame named after the input, and subset some other information from another table based on that number, for example:
number_4 <- subset(df, df$NO=="4")
As I might be doing hundreds of these, I do not want to name each data frame manually. Is there a way to use the character value to name a data frame?

We can use the assign function, which binds a value to a name supplied as a character string:
assign(paste0("number_", number), subset(df, NO == number))
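As a minimal sketch of how this behaves, the example below uses a small made-up df and a hard-coded number in place of the readline() call (both are assumptions for illustration). It also shows a named list as an alternative that is not part of the answer above, which may be easier to manage when hundreds of subsets are involved.

```r
# Toy data standing in for the asker's table (hypothetical values)
df <- data.frame(NO = c("3", "4", "4", "5"), value = c(10, 20, 30, 40))

# In an interactive session this would come from readline()
number <- "4"

# Build the name "number_4" and bind the matching rows to it
assign(paste0("number_", number), subset(df, NO == number))
nrow(number_4)

# Alternative (not from the answer above): collect subsets in one named list
results <- list()
results[[paste0("number_", number)]] <- subset(df, NO == number)
names(results)
```

With the list version, each subset is retrieved as results[["number_4"]], and the whole collection can be looped over with lapply.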

Related

field names repeated in output from loop to calculate new fields in R data frame

I'm using a for loop to create a set of new columns in an R dataframe, but in the output the original columns are duplicated with the dataframe name added as a prefix, and the new columns also have this prefix, which I don't want. I simply want the new output to be the same as the original dataframe, but with a set of new columns containing the new calculations. How do I achieve this? Details below:
These are the columns of the original dataframe:
Area; SR_2005;SR_2006;SR_2007;SR_2008;xnull_SR_2005;xnull_SR_2006;xnull_SR_2007;xnull_SR_2008
I then wanted to add a series of new fields to this dataframe, where each ‘SR’ column is divided by its corresponding ‘xnull_SR’ column (e.g. SR_2005 / xnull_SR_2005) and each new field is prefixed with “p_” (e.g. “p_2005”). Here is the code I've used:
for (j in 2005:2019)
{field = paste("p_", j, sep = "")
restab1 <- within(restab1, restab1[[field]] <- get(paste("SR_",j, sep = ""))/ get(paste("xnull_SR_",j, sep = "")))
}
What I hoped for is that I would just get the original data fields with the new fields (“p_2005”, “p_2006” etc.) added. Instead, I do get the new fields, but they are all prefixed with the name of the dataframe (e.g. restab1.p_2005), and the original fields are repeated: once with just the field name (e.g. “SR_2005”) and once with the dataframe prefix (e.g. “restab1.SR_2005”). These are the field names in the changed dataframe:
area SR_2005 SR_2006 SR_2007 SR_2008 xnull_SR_2005 xnull_SR_2006 xnull_SR_2007 xnull_SR_2008 restab1.area restab1.SR_2005 restab1.SR_2006 restab1.SR_2007 restab1.SR_2008 restab1.xnull_SR_2005 restab1.xnull_SR_2006 restab1.xnull_SR_2007 restab1.xnull_SR_2008 restab1.p_2005 restab1.p_2006 restab1.p_2007 restab1.p_2008
The calculations in the new fields (restab1.p_2005 restab1.p_2006 etc.) are correct but I just want the dataframe to contain the old and new field names once, and without the "restab1" prefix. How do I achieve this?
Consider simple element-wise division across multiple columns at once, which works because all columns of a data frame have the same length:
restab1[paste0("p_", 2005:2008)] <- restab1[paste0("SR_", 2005:2008)] / restab1[paste0("xnull_SR_", 2005:2008)]
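A small worked example of this vectorized division, assuming a made-up two-year version of restab1 (the values are hypothetical). The prefixed duplicates in the question most likely arise because assigning to restab1[[field]] inside within() creates a modified copy of the whole data frame, which within() then attaches as a column; indexing the data frame directly avoids that.

```r
# Hypothetical miniature of the asker's table
restab1 <- data.frame(
  Area = c("A", "B"),
  SR_2005 = c(10, 20), SR_2006 = c(30, 40),
  xnull_SR_2005 = c(5, 4), xnull_SR_2006 = c(10, 8)
)

years <- 2005:2006

# Divide each SR_<year> column by its xnull_SR_<year> counterpart
restab1[paste0("p_", years)] <-
  restab1[paste0("SR_", years)] / restab1[paste0("xnull_SR_", years)]

names(restab1)
restab1$p_2005
```

The original columns appear once, followed by p_2005 and p_2006, with no restab1 prefix.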

Index-Matching and assigning the matched value to a variable

I'm totally new to R and I'd like to perform some simple index matching.
I have a data frame with names on the first column and corresponding unique IDs on the second. I'd like to assign a specific ID to a particular variable and use it onward for data analysis. For Example:
names <- c('Kyle','Sophie','John','Peter','Julie','Carol')
IDs <- c('23513','15315','62352','25346','73424','03029')
df <- data.frame(names, IDs)
I've got a data frame like this, and want to assign a particular ID to a variable like:
Student_ID <- (sample formula to bring in an ID using a name, say "Kyle", so that this formula brings in '23513')
I'm extremely new to the coding environment so I don't even know if this is possible.
Thanks!
We can use match to get the position of student_name in the names column of df, and then use that position to pull the corresponding ID:
student_name <- "Kyle"
Student_ID <- df$IDs[match(student_name, df$names)]
Student_ID
#[1] 23513
#Levels: 03029 15315 23513 25346 62352 73424
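The "Levels" line in the output above appears because the IDs column was stored as a factor, which was the data.frame default before R 4.0.0. A hedged variant that keeps the IDs as plain character strings (so leading zeros such as in '03029' are preserved as text) might look like this:

```r
names <- c("Kyle", "Sophie", "John", "Peter", "Julie", "Carol")
IDs <- c("23513", "15315", "62352", "25346", "73424", "03029")

# stringsAsFactors = FALSE keeps both columns as character vectors
df <- data.frame(names, IDs, stringsAsFactors = FALSE)

student_name <- "Kyle"
Student_ID <- df$IDs[match(student_name, df$names)]
Student_ID
# [1] "23513"
```

If the name is not present, match returns NA and Student_ID will be NA, so it can be worth checking is.na(Student_ID) before using it.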

Concatenating strings to get object name

a <- "name"
df$a
Here, df is my data frame, and name is one of its column names. How can I make R use the value stored in a as the column name, instead of looking for a column literally called a?
Do the following: first extract the column you want to work with, then convert it into the kind of object you need.
Example:
A = factor(a)
A = as.vector(a)
1. You can only concatenate vectors.
2. A single letter such as "a" is not a descriptive object name. Use another name, for example: Work1
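A commonly used alternative, which is not what the answer above describes, is to index with [[ or [ so that the string stored in a is used as the column name. The sketch below assumes a small made-up df with a column called name.

```r
# Hypothetical data frame with a column called "name"
df <- data.frame(name = c("x", "y", "z"), score = c(1, 2, 3))

a <- "name"

df$a      # NULL: $ looks for a column literally called "a"
df[[a]]   # the "name" column, selected via the string stored in a
df[, a]   # same column using matrix-style indexing
```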

Assigning name to rows in R

I would like to assign names to rows in R but so far I have only found ways to assign names to columns. My data is in two columns where the first column (geo) is assigned with the name of the specific location I'm investigating and the second column (skada) is the observed value at that specific location. To clarify, I want to be able to assign names for every location instead of just having them all in one .txt file so that the data is easier to work with. Anyone with more experience than me that knows how to handle this in R?
First you need to import the data into your global environment. Try the function read.table().
To name rows, try the following (assuming your data.frame is named df):
rownames(df) <- df[, "geo"]
df <- df[, -1, drop = FALSE]  # drop = FALSE keeps a data frame (and its row names) when one column remains
Well, your question is not that clear...
I assume you are trying to create a data.frame with named rows. If you look at the data.frame help, you can see the description of the row.names parameter:
NULL or a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names for the data frame.
which means you can either specify the row names manually when you create the data.frame, or point to the column containing the names. The former can be achieved as follows:
d = data.frame(x=rnorm(10), # 10 random normally distributed values
               y=rnorm(10), # 10 random normally distributed values
               row.names=letters[1:10] # take the first 10 letters and use them as row names
)
while the latter is:
d = data.frame(x=rnorm(10), # 10 random normally distributed values
               y=rnorm(10), # 10 random normally distributed values
               r=letters[1:10], # take the first 10 letters
               row.names=3 # the column with the row names is the 3rd
)
If you are reading the data from a file, I will assume you are using read.table. Many of its parameters are the same as those of data.frame; in particular, its row.names parameter works the same way:
a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.
Finally, if you have already read the data.frame and you want to change the row names, Pierre's answer is your solution
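To illustrate the read.table route, here is a minimal sketch that reads a two-column table and uses the geo column as row names. The location names and values are made up, and the text argument stands in for a real file path.

```r
# Inline text standing in for the asker's .txt file (hypothetical contents)
txt <- "geo skada
Stockholm 12
Uppsala 7
Lund 3"

# row.names = "geo" tells read.table which column holds the row names
df <- read.table(text = txt, header = TRUE, row.names = "geo")

rownames(df)
df["Uppsala", "skada"]   # look up a value by location name
```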

How to avoid reading data from a dataframe when the passed column name do not match exactly?

I recently discovered that R will return data for a column name that does not exist exactly as passed, as long as the dataframe has a column whose name starts with what was passed.
So if you have a dataframe X with columns named, say, fruits and vegetables, and you try to retrieve data as X$fruit, it will give you the fruits column even though the passed name (fruit) does not match the column name (fruits). It only fails to return data when there is also a column name like fruitss, because in that case I believe R cannot decide whether X$fruit means fruits or fruitss.
How to avoid this?
The $ operator does partial matching, which can create confusion when column names share a prefix, so it is better to use [[ or [ to extract columns, as they match the entire string and not a partial one:
X[["fruit"]]
Or
X[, "fruit"]
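A short demonstration of the difference, assuming a made-up X with fruits and vegetables columns:

```r
# Hypothetical data frame matching the question's description
X <- data.frame(fruits = c("apple", "pear"),
                vegetables = c("kale", "leek"))

X$fruit                 # partial match: silently returns the fruits column
is.null(X[["fruit"]])   # TRUE: [[ requires an exact name, so nothing is found

# [ with an unknown column name raises an error instead of guessing
tryCatch(X[, "fruit"], error = function(e) conditionMessage(e))
```

If $ is still preferred for interactive work, setting options(warnPartialMatchDollar = TRUE) makes R warn whenever a partial match is used.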
