Aggregate in R based on variable condition - r

My table has two columns: REF_MONTHCPI, which stores the monthly reference periods, and CPI_RESTAURANT, which stores the CPI for that month. I want to create an annual variable called REF_YEARCPI that aggregates the 12 monthly CPI values within each year.

I don't know the name of your data frame, so I will assume it is called df.
df$REF_YEARCPI <- df$REF_MONTHCPI * 12
You can replace df in the above code with the name of your dataframe.
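Note that multiplying REF_MONTHCPI by 12 scales the reference period itself rather than combining the monthly CPI values. Here is a minimal sketch of grouping by year instead. It assumes REF_MONTHCPI is text such as "2020-01" with the year in the first four characters, and that "aggregate" means the annual mean; neither is confirmed by the question, and the sample data is made up.

```r
# Hypothetical sample data; the real column formats are not shown in the question.
df <- data.frame(
  REF_MONTHCPI   = c("2020-01", "2020-02", "2021-01", "2021-02"),
  CPI_RESTAURANT = c(100, 102, 104, 106)
)

# Assumption: the first four characters of REF_MONTHCPI are the year.
df$REF_YEARCPI <- substr(df$REF_MONTHCPI, 1, 4)

# One row per year. FUN = mean is an assumption; use sum if a total is wanted.
annual <- aggregate(CPI_RESTAURANT ~ REF_YEARCPI, data = df, FUN = mean)
print(annual)
```

If the reference period is stored as a Date instead, format(df$REF_MONTHCPI, "%Y") would extract the year in place of substr().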

Related

function in R for setting data as panel, similar to xtset in Stata

Is there a function in R similar to "xtset id year" in Stata that will convert a dataset from 'wide' to 'long' based on a list of variables with the year appended to the end of each variable name, for each year of the panel?
Update: The dataset is set up as follows (column headers):
State_abbrv Program_22 ParticipantCount_22 Program_21 ParticipantCount_21 Program_20 ParticipantCount20
The Program variables are categorical and hold the name of the program; the ParticipantCount variables hold the numeric count of participants in a program for that year. The three program years 20-22 are appended to the variable names. In Stata, the xtset command will take the year directly from the years appended to each variable and create a row for each observation for each year.
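One way to get the wide-to-long step in base R is reshape(), which can split column names on a separator. This is a sketch on made-up data with two years only, and it assumes every time-varying column is named with an underscore before the year (a column like ParticipantCount20 would need renaming first).

```r
# Hypothetical wide data in the layout described above (two years for brevity).
wide <- data.frame(
  State_abbrv         = c("CA", "NY"),
  Program_22          = c("A", "B"),
  ParticipantCount_22 = c(10, 20),
  Program_21          = c("A", "C"),
  ParticipantCount_21 = c(8, 18)
)

# Split each varying column name at "_" into a variable name and a year.
long <- reshape(
  wide,
  direction = "long",
  idvar     = "State_abbrv",
  varying   = setdiff(names(wide), "State_abbrv"),
  sep       = "_",
  timevar   = "year"
)

print(long)
```

The result has one row per state and year, with columns State_abbrv, year, Program and ParticipantCount.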

R Load both variables and values from table

My goal is to maintain both variables and their values in a spreadsheet.
Basically, I want to be able to add the values for a new year in a new column and load them into R.
I then want to assign to each variable named in the first column the corresponding value from either the second or the third column.
Input spreadsheet:
Variable  Year2013             Year2018
age       12                   17
pets      c(cat,dog,elephant)  c(dog,mouse)
cars      cars$name            cars$name
Desired Output:
For year 2013
import("dataspreadsheet.csv")
derived from this -->
age <- 12
pets <- c(cat,dog,elephant)
cars <- cars$name
Is there any way to tell R to make this assignment?
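One possible approach is to treat each cell as R code and evaluate it. The sketch below assumes the cells contain valid R expressions (so strings are quoted, e.g. c("cat","dog")), uses a semicolon-separated stand-in for the spreadsheet so that commas inside c(...) do not split cells, and leaves out the cars row because it depends on an object that must already exist. All of the data here is hypothetical.

```r
# Hypothetical stand-in for the spreadsheet file.
csv_text <- 'Variable;Year2013;Year2018
age;12;17
pets;c("cat","dog","elephant");c("dog","mouse")'

# quote = "" keeps the double quotes as part of each cell's text.
sheet <- read.csv(text = csv_text, sep = ";", quote = "",
                  colClasses = "character")

year_col <- "Year2013"  # column to load

# Parse and evaluate each cell, keeping the results in a named list.
vals <- setNames(
  lapply(sheet[[year_col]], function(s) eval(parse(text = s))),
  sheet$Variable
)

# Optionally turn the list entries into variables in the global environment.
list2env(vals, envir = globalenv())
```

Because eval(parse()) runs whatever code is in the cells, this is only reasonable for spreadsheets you control. Keeping the values in the vals list is often easier to manage than separate variables.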

Creating a data frame with 5000 rows using a 500-row reference file

My reference file has 500 rows and 11 columns (years 2007:2017), with a date as the value in each column.
I have to create a dummy data frame of 5000 rows and 11 columns (years 2007:2017). I want each value to be a random date within one month of a date from the reference file. I think the following would create a random date within a month:
reviewdate$x2007_1 <- as.Date(reviewdate$X2007, format = "%d/%m/%Y") + sample((-15:15), 1)
I need to create a for loop so I can run this over selected columns of my data frame and create random dates for the 2007 to 2017 period.
My second question: the reference file has only 500 records for each year, but I want 5000 records for each year. How can I generate random dates for 5000 rows per year using a reference file that has 500 rows per year?
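A sketch of one way to do both steps: resample the 500 reference rows with replacement to get 5000 rows, then loop over the year columns and add a separate random offset to every row. The reference data below is made up, and the column names X2007 to X2017 and the "%d/%m/%Y" format are taken from the line of code in the question.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical reference file: 500 rows, one date string per year column.
years <- 2007:2017
reviewdate <- as.data.frame(
  setNames(
    lapply(years, function(y) rep(sprintf("15/06/%d", y), 500)),
    paste0("X", years)
  ),
  stringsAsFactors = FALSE
)

n <- 5000

# Draw 5000 reference rows with replacement.
dummy <- reviewdate[sample(nrow(reviewdate), n, replace = TRUE), ]

# Shift each date by its own random offset of -15 to +15 days.
for (col in names(dummy)) {
  base_date    <- as.Date(dummy[[col]], format = "%d/%m/%Y")
  dummy[[col]] <- base_date + sample(-15:15, n, replace = TRUE)
}

str(dummy)
```

Note that sample(-15:15, 1) draws a single offset and applies it to every row, whereas sample(-15:15, n, replace = TRUE) draws one offset per row.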

Include variable value on data frame name

I'm trying to figure out how can I add something to a data frame df, based on a variable (i.e. a date), ending up with a data frame named df_17 if variable is equal to 2017 for example.
The reason why I want this is because I'm importing datasets from several years and quarters, and I would like to make sure that they are named according to the year variable they have. Each dataset only has 1 date. I know I can do it manually but it would take me less time to automate it.
I know how to do it with columns and rows, but I can't figure it out for objects.
EDIT:
Example 1:
Data frame name "df"
A B Date
1 4 2017
2 3 2017
New data frame name "df_2017"
Example 2:
Data frame name "df"
A B Date
1 4 2016
2 3 2016
New data frame name "df_2016"
The assign function should do what you want. A solution could look like
assign(paste0("df_", year), dataframe_read_from_file, pos = 1)
If you use assign inside a function or a loop, make sure that you set the pos option correctly.
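A small sketch of how this might look with the example data from the question, reading the year from the Date column. The data frame is made up, and the check for a single year reflects the statement that each dataset only has one date.

```r
# Example data from the question.
df <- data.frame(A = c(1, 2), B = c(4, 3), Date = c(2017, 2017))

# Each dataset is expected to contain exactly one year.
year <- unique(df$Date)
stopifnot(length(year) == 1)

# Build the name "df_2017" and bind the data frame to it.
# Run at top level, this creates the object in the global environment.
new_name <- paste0("df_", year)
assign(new_name, df)

print(new_name)
print(get(new_name))
```

An alternative that avoids creating objects by name is to keep the imported data frames in a named list, e.g. datasets[[paste0("df_", year)]] <- df, where datasets is a hypothetical list.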

Create a stack of n subset data frames from a single data frame based on date column

I need to create a set of subset data frames out of a single big df, based on a date column (e.g. "Aug 2015", in month-year format). It should work like the subset function, except that the number of subset dfs should change dynamically depending on the values available in the date column.
All the subset data frames need to have the same structure, and within each subset df the date column should hold a single value.
For example, if my big df currently has the last 10 months of data, I need 10 subset data frames now, and 11 dfs if I run the same command next month (with 11 months of base data).
I have tried something like the code below, but after each iteration the subset subdf_i is overwritten. So I end up with only one subset df, which holds the last value of the month column.
I expected it to create 45 subset dfs, subdf_1, subdf_2, ... subdf_45, one for each of the 45 unique values of the month column.
uniqmnth <- unique(df$mnth)
for (i in 1:length(uniqmnth)){
subdf_i <- subset(df, mnth == uniqmnth[i])
i==i+1
}
I hope there should be some option in the subset function or any looping might do. I am a beginner in R, not sure how to arrive at this.
The solution is to use assign() so that the loop index i is appended to the name of each of the 45 subsets (thanks to a friend for the hint). This stops the subset data frame from being overwritten on each pass of the loop. The for loop advances i by itself, so the i==i+1 line is not needed.
uniqmnth <- unique(df$mnth)
for (i in seq_along(uniqmnth)) {
  # Create subdf_1, subdf_2, ... with one data frame per unique month
  assign(paste0("subdf_", i), subset(df, mnth == uniqmnth[i]))
}
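As an alternative to creating numbered objects, base R's split() returns all the subsets at once as a named list, and the number of elements follows the number of unique months automatically. A minimal sketch on made-up data, assuming the month column is called mnth as in the question:

```r
# Hypothetical data with a month column in "Mon YYYY" format.
df <- data.frame(
  mnth  = c("Aug 2015", "Aug 2015", "Sep 2015"),
  value = c(1, 2, 3)
)

# One data frame per unique month, named after the month value.
subsets <- split(df, df$mnth)

print(names(subsets))
print(subsets[["Aug 2015"]])
```

A list like this can be processed with lapply(), which is usually simpler than looking up subdf_1, subdf_2, ... by name.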
