Trying to create a count variable in R

I'm attempting to create a counter variable in R which loops through the n rows of my 442 column dataframe and increases the counter by 1 on every 55th row.
I've tried the following code:
dataset$num=ceiling(row(dataset)/55)
This computes the right values, but R repeats the result for every column in my dataframe rather than creating a single new column containing the counter variable. So I end up with 442 copies of the same variable, titled num.1, num.2, ..., num.442.
What am I doing wrong? Thanks!

It sounds like you just want something like:
rep(1:1000,each=55,length.out=nrow(dataset))
The 1000 here could be anything as long as it's at least ceiling(nrow(dataset)/55); length.out then trims the result to the number of rows.
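The 442 copies appear because row() returns a matrix with the same dimensions as the data frame, so the assignment stores a 442-column matrix rather than a vector. A minimal sketch of the original ceiling() idea applied to a plain vector of row indices instead; the toy data frame below is made up for illustration:

```r
# Toy data frame standing in for the real 442-column one (assumption).
dataset <- data.frame(a = rnorm(120), b = rnorm(120))

# seq_len(nrow(dataset)) is a plain vector 1..n, so this adds exactly one column.
dataset$num <- ceiling(seq_len(nrow(dataset)) / 55)

# Rows 1-55 get 1, rows 56-110 get 2, and the remaining rows get 3.
table(dataset$num)
```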

Related

Using FOR LOOP over Multiple Columns of MATRIX and keeping FIRST column constant in RStudio

I am running the Automatic Variance Ratio (AVR) test on my dataset in R. My dataset contains 6 indices, i.e. 6 columns excluding the date column. For this test, I need a for loop that always keeps the first column (the Date column) and moves through the columns from the 2nd to the 6th in turn. I am new to R, so I don't know exactly how to do this. Currently, my code runs the test for the 2nd column only; it does not loop over the columns after it. Any help would be appreciated.
A standard way to loop through the columns of a dataframe is with lapply. If your dataframe is df with 7 columns and you want to loop through columns 2 through 7 and your function is Av.VR() then
output_list <- lapply(df[,2:7], function(x) Av.VR(x))
should yield a list of outputs for each column.
Note I have no experience using the function Av.VR().
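As a rough sketch of that pattern, the example below uses made-up data and a hypothetical stand-in function, my_stat(), in place of Av.VR(), whose real interface is not shown here. It also shows the same loop written with for, since that is what the question asked about.

```r
# Made-up data: a Date column followed by six index columns (assumption).
set.seed(1)
df <- data.frame(Date = as.Date("2020-01-01") + 0:99,
                 matrix(rnorm(600), ncol = 6))

# Hypothetical stand-in for the real test function.
my_stat <- function(x) var(diff(x, lag = 2)) / (2 * var(diff(x)))

# Apply the function to columns 2 through 7; the Date column is left out.
output_list <- lapply(df[, 2:7], my_stat)

# The same loop written with for, storing each result under its column name.
results <- list()
for (col in names(df)[2:7]) {
  results[[col]] <- my_stat(df[[col]])
}
```

Both versions produce one result per index column; if the function also needs the dates, df$Date can be passed as an extra argument inside the loop.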

How do I add a column with a different number of rows?

I am calculating returns in R and trying to add them to the dataframe I am working with, but it doesn't work because the row counts differ: the dataframe has 194 rows, whereas the computed returns have 193.
This code works just fine when doing it on its own:
diff(log(capm$price_Ford))
But when I try to assign it to the dataframe as its own column, I get an error:
capm$ford_ret <- diff(log(capm$price_Ford))
How can I assign the data with 193 rows, to a dataframe with 194 rows?
In a nutshell, you can’t. Each column in a table must have the same number of rows. You need to decide what to fill into the row that’s missing a value. Depending on your use-case, this might for example be 0 or NA. You also need to decide whether the missing value should go at the beginning or at the end (for a difference, usually at the beginning). For example:
capm$ford_ret <- c(NA, diff(log(capm$price_Ford)))
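To make the length mismatch concrete, here is a small sketch with made-up prices standing in for capm$price_Ford:

```r
# Made-up prices standing in for the real data (assumption).
capm <- data.frame(price_Ford = c(10, 11, 10.5, 12))

# diff() returns one element fewer than its input: 3 values for 4 rows.
n_returns <- length(diff(log(capm$price_Ford)))

# Padding the front with NA makes the lengths match, so the assignment works.
capm$ford_ret <- c(NA, diff(log(capm$price_Ford)))
```

The first row has no previous price, which is why NA at the beginning is the natural choice for a return series.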

How to add a record to a dataframe using rbind in R

I am trying to add rows to a dataframe using rbind in R. However, the dataframe is not being updated each time I attempt to add a row. In other words, the following code results in a dataframe with 2 columns but 0 observations when it should have 2 columns with 2 observations.
modeldata2<-data.frame(Model=character(),Accuracy=numeric())
modelname<-"A"
accuracystr2<-2.2
rbind(modeldata2,list(modelname,accuracystr2))
modelname<-"B"
accuracystr2<-3.2
rbind(modeldata2,list(modelname,accuracystr2))
I am using this at the end of a loop to record values and therefore need to first initialize an empty dataframe and then add records to the dataframe at the end of each loop. The code above is just an example that I am using to troubleshoot the problem. I have also tried using c instead of list but the result was the same.
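One likely cause is that rbind() returns a new dataframe instead of modifying modeldata2 in place, so its result has to be assigned back. A minimal sketch under that assumption; the models and accuracies vectors are hypothetical placeholders for the values produced in each loop iteration:

```r
modeldata2 <- data.frame(Model = character(), Accuracy = numeric())

# Hypothetical values recorded in each loop iteration.
models <- c("A", "B")
accuracies <- c(2.2, 3.2)

for (i in seq_along(models)) {
  # rbind() returns a new data frame; assign it back to keep the added row.
  modeldata2 <- rbind(modeldata2,
                      data.frame(Model = models[i], Accuracy = accuracies[i]))
}
```

Building each new row as a one-row data.frame with matching column names keeps the column names and types consistent across iterations.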

Update a data frame within a for loop

The point of this question is that I want to know how to update a dataframe inside either a for loop or a function. I know there are other ways to do the specific task I am looking at, but I want to know how to do it the way I am trying to do it.
I have a data frame with 15 columns and 2k observations, some of which contain the values 98 and 99. For each row where there is a 98 or 99 in any variable/column, I want to remove the whole row. I created a function to filter by variable name not equal to 98/99, and used lapply. However, instead of continually updating the data frame, it just spits out a series of data frames, overwriting the previous data frame, meaning that at the end I only get a data frame with the last column cleaned. How do I get it to update the data frame for each column sequentially?
nafunction = function(variable){
kuwait5=kuwait5%>%
filter(variable<90)
}
lapply(kuwait5, nafunction)
The expected result is a new data frame with all rows that have a 98 or 99 removed. What I get is a sequence of data frames, each one having ONE column in which the rows with those values are removed.
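Two things probably contribute here: the assignment inside the function only changes a local copy of kuwait5, and lapply passes each column's values to the function rather than a column name. A sketch of a for loop that reassigns the data frame on every pass, using made-up data in place of the real kuwait5 and base R subsetting instead of filter():

```r
# Made-up data standing in for kuwait5 (assumption).
kuwait5 <- data.frame(q1 = c(1, 98, 3, 4), q2 = c(2, 2, 99, 5))

for (col in names(kuwait5)) {
  # Reassign on every pass so each column filters the already-reduced data.
  kuwait5 <- kuwait5[!(kuwait5[[col]] %in% c(98, 99)), , drop = FALSE]
}
```

Because kuwait5 is overwritten at the top level of the loop, each iteration starts from the rows that survived the previous columns, which is the sequential update described above.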

Repeat every number in a column a specific amount of times using R code

I would like to create a column of numbers using R code. I want the numbers one through five to repeat 200 times in a row each, and then the number 6 to repeat 125 times in a row. How can I code this? I tried just coding
New <- c(1:6,each=200)
hoping it would just stop when it had filled all 1125 available columns. But I just get an error message instead.
Thanks for your help!
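One way to get unequal repeat counts is the times argument of rep(), which accepts one count per element. A short sketch:

```r
# 1 through 5 repeated 200 times each, then 6 repeated 125 times.
New <- rep(1:6, times = c(rep(200, 5), 125))

# Total length: 5 * 200 + 125 = 1125.
length(New)
```

Note that each is an argument of rep(), not of c(); inside c() it just becomes an extra named element of the vector.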
