I want to count the frequencies of specific rows by age group. The steps are:
1. In the data frame "pud", find the rows where the column "icd3" meets the following conditions.
2. Select the qualifying rows and count the frequencies.
The code is as follows:
u2<-which(pud$icd3>="A00"&pud$icd3<="A99"|
pud$icd3>="B00"&pud$icd3<="B94"|
pud$icd3=="B99")
u3<-which(pud$icd3>="A00"&pud$icd3<="A99"|
pud$icd3>="B00"&pud$icd3<="B49"|
pud$icd3>="B90"&pud$icd3<="B94"|
pud$icd3=="B99")
for (i in 2:3){co[i]=addmargins(table(pud[u[i],]$agegroups))}
But the console reports:
for (i in 2:3){co[i]=addmargins(table(pud[u[i],]$agegroups))}
Error in [.data.frame(pud, u[i], ) : object 'u' not found
How can I fix the code?
If you want the frequencies, why not do it like this?
sum(pud$icd3>="A00"&pud$icd3<="A99"|
pud$icd3>="B00"&pud$icd3<="B94"|
pud$icd3=="B99")
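As for the error itself: the loop refers to u[i], but only u2 and u3 were ever created, and R does not build a variable name from an index. A minimal sketch of one way around that, assuming the index vectors are kept in a list and the per-group tables are collected with lapply (the sample values in pud below are made up for illustration, and co is assumed to be meant as a container for the tables):

```r
# Hypothetical stand-in for the asker's `pud`; the values are invented.
pud <- data.frame(
  icd3      = c("A01", "B50", "B99", "C10", "A99", "B92"),
  agegroups = factor(c("0-14", "15-64", "65+", "0-14", "15-64", "65+"))
)

# Keep the index vectors in one named list so they can be iterated over.
u <- list(
  u2 = which(pud$icd3 >= "A00" & pud$icd3 <= "A99" |
             pud$icd3 >= "B00" & pud$icd3 <= "B94" |
             pud$icd3 == "B99"),
  u3 = which(pud$icd3 >= "A00" & pud$icd3 <= "A99" |
             pud$icd3 >= "B00" & pud$icd3 <= "B49" |
             pud$icd3 >= "B90" & pud$icd3 <= "B94" |
             pud$icd3 == "B99")
)

# One frequency table (with a "Sum" margin) per index vector.
co <- lapply(u, function(idx) addmargins(table(pud$agegroups[idx])))
co$u2
co$u3
```

Using a list also means co[[i]] can hold a whole table, which a plain co[i] assignment into an atomic vector cannot.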
Related
I have a dataset in R titled NIS_data. I am trying to count the number of missing values for each variable using the following code:
attach(NIS_data)
lapply(colnames(NIS_data), function(var) {sum(is.na(var))})
My reasoning behind this code is that colnames(NIS_data) returns a vector; my function will then sum up the number of missing values for each variable. However, R returns a list of length 9 where each element in the list has a value of 1. I know that this is not correct (there is not 1 missing observation per variable). Any help is appreciated.
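One likely cause is that colnames(NIS_data) yields the column names as strings, so is.na() is applied to each name rather than to the column's values. A small sketch of counting per column instead, assuming a toy NIS_data whose column names are invented here:

```r
# Hypothetical stand-in for NIS_data; column names and values are invented.
NIS_data <- data.frame(
  age = c(34, NA, 51),
  sex = c("F", "M", NA),
  los = c(NA, NA, 2)
)

# is.na() on a data frame gives a logical matrix; colSums counts TRUEs per column.
na_counts <- colSums(is.na(NIS_data))

# Equivalent with lapply: iterate over the columns themselves, not their names.
na_list <- lapply(NIS_data, function(col) sum(is.na(col)))

na_counts
```

Neither version needs attach(), since the columns are reached through the data frame directly.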
I am still new to R and I am attempting to solve a seemingly simple problem. I would like to identify all of the unique combinations of values from 4 different columns, and update an additional column in my df to annotate whether or not each row is unique.
Given a df with columns A-Z, I have used the following code to identify unique combinations of columns A, B, C, D, and E. I am trying to update column F with this information.
unique(df[ ,c("A", "B","C","D", "E")])
This returns each of the individual rows with unique combinations as expected, but I cannot figure out what next step I should take to update column "F" with a value indicating that it is a unique row. Thanks in advance for any pointers!
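One possible next step, sketched under the assumption that "unique" means the combination occurs exactly once in the data: duplicated() run from both ends marks every row that has a twin, and negating that gives the flag. The toy df below is invented for illustration.

```r
# Hypothetical data; rows 1 and 2 share the same A-E combination.
df <- data.frame(
  A = c(1, 1, 2, 3),
  B = c("x", "x", "y", "z"),
  C = c(0, 0, 0, 0),
  D = c(TRUE, TRUE, FALSE, TRUE),
  E = c("p", "p", "q", "p")
)

key <- df[, c("A", "B", "C", "D", "E")]

# duplicated() alone skips the first occurrence, so scan from both directions
# to catch every member of a repeated combination.
has_twin <- duplicated(key) | duplicated(key, fromLast = TRUE)

# TRUE where the combination appears only once.
df$F <- !has_twin
df
```

If the goal is instead to mark the first occurrence of each combination, !duplicated(key) alone would do that.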
I have a data frame containing tree ring chronologies. The row names are important because they are my years. I would like to obtain the row names (years) for which a chronology (column) meets a condition of the type "value below a threshold for at least a certain length of time". I tried the rle function, which is useful for checking whether my conditions are met but not for finding when they occur.
df = my data frame
min_sup = ring width threshold
min_yr = the minimum number of years my ring width must stay below the min_sup threshold
I tried several things, for example:
x=row.names(df)[with(rle(df[,1]< min_sup),lengths >= min_yr)]
Thank you for your help.
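The attempt above indexes the row names with one logical value per run rather than one per row, which may be why the years do not line up. A sketch of one way to expand the run-level result back to row level, assuming no missing values in the column (the ring widths and years below are invented):

```r
# Hypothetical chronology; row names are the years.
df <- data.frame(
  chrono = c(1.2, 0.4, 0.3, 0.5, 1.1, 0.2, 1.3, 0.1, 0.2, 0.3),
  row.names = 1991:2000
)
min_sup <- 0.6  # ring width threshold
min_yr  <- 3    # minimum run length, in years

below <- df[, 1] < min_sup
runs  <- rle(below)

# A run qualifies if it is a "below threshold" run and long enough.
keep_run <- runs$values & runs$lengths >= min_yr

# Repeat each run's verdict once per row it covers.
in_long_run <- rep(keep_run, runs$lengths)

years <- row.names(df)[in_long_run]
years
```

The same steps could be wrapped in a function and applied to every column with lapply if all chronologies need checking.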
I would like to create a new column in my dataframe that assigns a categorical value to each observation based on a condition.
In detail, I have a column that contains timestamps for all observations. The rows are sorted in ascending order by timestamp.
Now I'd like to calculate the difference between consecutive timestamps, and if it exceeds a certain threshold the factor should be increased by 1 (see Desired Output).
Desired Output
I tried to solve it with a for loop, but that takes a lot of time because the dataset is huge.
After searching for a bit I found this approach and tried to adapt it: R - How can I check if a value in a row is different from the value in the previous row?
ind <- with(df, c(TRUE, timestamp[-1L] > (timestamp[-length(timestamp)]-7200)))
However, I cannot make it work for my dataset.
Thanks for your help!
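A loop-free sketch of the idea described above, assuming the timestamps are numeric (for example seconds) and reusing 7200 from the attempt as the threshold: diff() gives the gaps between consecutive rows, and cumsum() over the "gap too large" flags produces a counter that steps up at each break. The sample values are invented.

```r
# Hypothetical timestamps, already sorted in ascending order.
df <- data.frame(timestamp = c(0, 1000, 9000, 9500, 20000, 20100))
threshold <- 7200  # taken from the attempt above; units assumed to match

# TRUE for the first row (starts group 1) and wherever the gap exceeds the threshold.
new_group <- c(TRUE, diff(df$timestamp) > threshold)

# The running count of TRUEs is the group number.
df$group <- factor(cumsum(new_group))
df
```

With POSIXct timestamps, wrapping the gap as as.numeric(diff(x), units = "secs") keeps the comparison in known units.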
I have a dataframe with multiple columns and I want to apply different functions on each column.
An example of my dataset -
I want to calculate the count of column pq110a for each country in the qcountry2 column (me = Mexico, br = Brazil, ar = Argentina). The problem is that I have to filter on these columns. For example, for sample patients I want:
Count of pq110 when the values are 1 and 2 (for some patients)
Count of pq110 when the value is 3 (for other patients)
Similarly when the value is 6.
For total patients I want the total count of pq110.
The output I am expecting is: Output
I want this output for each country as well.
Please suggest how I can do this for other columns too, by country.
Thanks !!
I guess what you want to do is count the number of rows of 'pq110' that have the same value within each 'qcountry2'.
So I'll use 'tapply' to split the data into subsets by country and then use 'table' to count the rows for each distinct value.
tapply(my_data[,"pq110"], INDEX = as.factor(my_data[,"qcountry2"]), function(x)table(x))
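If a single country-by-value table is easier to read than a list of tables, something along these lines might work. This is only a sketch: the sample my_data is invented, and pooling values 1 and 2 into one category is my reading of the question.

```r
# Hypothetical data using the column names from the question.
my_data <- data.frame(
  qcountry2 = c("me", "me", "br", "br", "ar", "ar", "me"),
  pq110     = c(1, 2, 3, 6, 1, 3, 6)
)

# Pool 1 and 2 into one category; keep other values as they are.
grp <- ifelse(my_data$pq110 %in% c(1, 2), "1-2", as.character(my_data$pq110))

# Cross-tabulate country against category and append a per-country total.
counts <- addmargins(table(country = my_data$qcountry2, pq110 = grp), margin = 2)
counts
```

Repeating this for other columns would only need the column name swapped, for example inside a small helper function.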