How can I evaluate a single-arm treatment without a control group? - r
I am evaluating a combined surgical and antibiotic treatment for diabetic foot ulcers. 30 patients with diabetic foot ulcers were enrolled in this study, and the dates of the first and last visits were recorded (the treatment duration in weeks was calculated from them). I consider this a single-arm study because I had no control group. I recorded CRP before and after the treatment; patients with an absolute difference in CRP of less than 10 were considered healed, otherwise they were recorded as not healed. How can I get started evaluating my treatment in R? I am looking for a statistical approach and methodology.
Thanks in advance.
My data
crp_before = c(96.1,90.4,114.4,88.3,76.1,191.2,69.8,122.3,188.6,77.3,126.8,189.3,165.2,116.8,72.3,120.9,122.3,115.2,90,142.3,87.2,195.5,184.3,110.2,113.6,147.4,96.8,116.4,55.3,209)
crp_after = c(5.3,7,6.2,3.5,4.2,9.6,5.2,5.3,9.6,8,7.6,11,10.3,4.6,3.2,8.6,7.5,8.4,6.3,7.6,6.8,112,6.3,8.5,9.2,5.3,4.1,7.6,3,100)
time_week = c(9,8,12,8,4,24,4,8,24,4,12,24,20,12,5,12,13,12,8,16,8,24,24,8,8,16,8,12,3,4)
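One possible starting point for a single-arm before/after design is a paired test on CRP plus an exact confidence interval for the healing proportion. The sketch below uses only base R and the vectors posted above. It assumes that "healed" means a post-treatment CRP below 10; that reading of the healing rule, and the names `crp_change`, `healed`, and `n_healed`, are assumptions made for illustration and are not part of the original post. Without a control group, these results describe change over time and cannot by themselves attribute it to the treatment.

```r
crp_before <- c(96.1,90.4,114.4,88.3,76.1,191.2,69.8,122.3,188.6,77.3,126.8,
                189.3,165.2,116.8,72.3,120.9,122.3,115.2,90,142.3,87.2,195.5,
                184.3,110.2,113.6,147.4,96.8,116.4,55.3,209)
crp_after  <- c(5.3,7,6.2,3.5,4.2,9.6,5.2,5.3,9.6,8,7.6,11,10.3,4.6,3.2,8.6,
                7.5,8.4,6.3,7.6,6.8,112,6.3,8.5,9.2,5.3,4.1,7.6,3,100)
time_week  <- c(9,8,12,8,4,24,4,8,24,4,12,24,20,12,5,12,13,12,8,16,8,24,24,
                8,8,16,8,12,3,4)

# Paired change in CRP for each patient (positive = CRP went down)
crp_change <- crp_before - crp_after
summary(crp_change)

# Paired Wilcoxon signed-rank test: is the median change different from zero?
# exact = FALSE requests the normal approximation, which tolerates ties.
wt <- wilcox.test(crp_before, crp_after, paired = TRUE, exact = FALSE)
print(wt)

# ASSUMPTION: healed = post-treatment CRP below 10 (change this line if the
# intended rule is different)
healed   <- crp_after < 10
n_healed <- sum(healed)

# Exact (Clopper-Pearson) 95% confidence interval for the healing proportion
bt <- binom.test(n_healed, length(healed))
print(bt$conf.int)

# Median treatment duration (weeks) by healing status
tapply(time_week, healed, median)
```

If a benchmark healing rate from the literature is available, it could be passed to `binom.test()` through its `p` argument to test the observed proportion against that historical value.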
Related
Time series forecasting of outcome variable based on current performance of outcome variable in R
I have a very large dataset (~55,000 data points) for chicken crops. Chickens are grown over a ~35-day period. The dataset covers 10 sheds of ~20,000 chickens each. The sheds contain weighing platforms, and as chickens step on them the recorded weight is sent to a server. Data is sent continuously from day 0 to the final day. The variables I have are: House (as a number, House 1 up to House 10), Weight (measured in grams, to 5 decimal places) and Day (measured as a decimal number, e.g. 12 noon on day 0 is 0.5, whereas 23.3 is about a third of the way through day 23, roughly 8 AM; because the data is sent continuously, these numbers can be very precise). I want to construct either a time series regression model or an ML model so that, as the sensors send data for a new crop, the model can predict what the end weight will be. Then, when that crop cycle finishes, it can be added to the training data and the process repeated. Currently I'm using this very simple weight vs. time model, but eventually I would include things like temperature, water and food consumption, humidity, etc. I've run regression analyses on the data sets to determine the relationship between time and weight (it's likely quadratic, see image attached) and tried using randomForest in R to create a model. The test model seemed to work well, in that the test MAPE was similar to the training value, but that was by taking out one house and using it as the test set. Potentially what I've tried so far is completely the wrong methodology, but this is a new area for me, so I'm really not sure of the best approach.
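The "take out one house and use it as the test set" idea described above can be repeated for every house, which gives a leave-one-house-out estimate of MAPE. The following is a minimal sketch of that idea using a quadratic regression in base R. The data frame `crop`, its simulated values, and the helper `mape()` are hypothetical stand-ins, since the real sensor data is not shown; only the column names House, Day, and Weight come from the question.

```r
set.seed(123)

# HYPOTHETICAL data: 10 houses, 200 weighings each, over a 35-day crop
n_per_house <- 200
crop <- data.frame(
  House = factor(rep(1:10, each = n_per_house)),
  Day   = runif(10 * n_per_house, min = 0, max = 35)
)
house_factor <- exp(rnorm(10, mean = 0, sd = 0.03))   # small house-to-house shift
crop$Weight <- (45 + 15 * crop$Day + 1.4 * crop$Day^2) *
  house_factor[as.integer(crop$House)] *
  exp(rnorm(nrow(crop), sd = 0.05))                   # multiplicative noise

# Mean absolute percentage error, in percent
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

# Leave-one-house-out: fit on 9 houses, evaluate on the held-out house
loho_mape <- sapply(levels(crop$House), function(h) {
  train <- crop[crop$House != h, ]
  test  <- crop[crop$House == h, ]
  fit   <- lm(Weight ~ Day + I(Day^2), data = train)
  mape(test$Weight, predict(fit, newdata = test))
})

round(loho_mape, 1)   # one MAPE per held-out house
mean(loho_mape)       # average out-of-house error
```

Holding out whole houses rather than random rows keeps readings from the same house out of both training and test sets, so the error estimate reflects performance on an unseen crop.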
Is It Appropriate to Conduct Interrupted Time Series (ITS) Analysis or Repeated-Measures Panel Analysis When Intervention Start Dates Vary?
I am attempting to estimate the causal effect of intervention receipt (i.e., enrollment in a case management program) on a set of count outcomes (i.e., monthly visits to the doctor). Individuals enroll in the case management program at different points in time (e.g., an individual can enroll in the program anytime between 01/2017 and 01/2022). I have count data on the number of monthly doctor visits for each client for each of the 24 months prior to program enrollment and the 24 months following program enrollment. I want to estimate whether the number of doctor visits decreases following enrollment in the case management program. Most of the interrupted time series (ITS) research for count data (e.g., negative binomial count models using tscount in R) I have come across uses population-level interventions which occur at one discrete time-point (e.g., July 1, 2018) instead of individual-level interventions which occur at varying time-points (e.g., one client enrolls on July 1, 2018; another client enrolls on January 1, 2019). I would appreciate any guidance on how to explore this question going forward (e.g., is an ITS design where intervention start dates vary across individuals even appropriate analytically or would some version of a repeated-measures panel approach with an intervention dummy be more appropriate)? Thanks!
GLM for overdispersed count data, negative residual trends
I have been trying to analyze count data of shark detections and how it has changed across different periods of time over several years. I have y = number of detections for an event, x = covid_period, a factor with three levels (before, during, after), as well as diel period (day/night), sex (male or female), and year (to see whether the covid period differs in detections from the same period in other years, when there wasn't a lockdown). Since all my explanatory variables are categorical, I have been trying to run a glm with family=quasipoisson. glm_quasi<-glm(num_detections~covid_periodyearSEXdiel_periodanimal_id, family=quasipoisson, data=nursesharks) My residuals/qqnorm plots indicate this is not a good model. In essence, I want to know whether there were more or fewer detections of female sharks during the day during the covid_period of 2020. Am I choosing the right model?
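One way to judge whether a quasi-Poisson or negative binomial model is warranted is to estimate the dispersion from a plain Poisson fit and then compare it with a negative binomial fit. The sketch below does this on simulated data, because the `nursesharks` data is not shown; the data frame `sharks`, its values, and the model formula are hypothetical and are not the asker's model. It uses `glm.nb()` from MASS, which ships with standard R distributions. Repeated detections of the same animal are ignored here; a random effect for `animal_id` would be a further step.

```r
library(MASS)  # provides glm.nb()
set.seed(2020)

# HYPOTHETICAL data standing in for the shark detections
n <- 600
sharks <- data.frame(
  covid_period = factor(sample(c("before", "during", "after"), n, replace = TRUE),
                        levels = c("before", "during", "after")),
  diel_period  = factor(sample(c("day", "night"), n, replace = TRUE)),
  SEX          = factor(sample(c("F", "M"), n, replace = TRUE))
)
mu <- exp(1 + 0.4 * (sharks$covid_period == "during") +
              0.3 * (sharks$diel_period == "day"))
sharks$num_detections <- rnbinom(n, size = 1.5, mu = mu)  # overdispersed counts

# Poisson fit and Pearson dispersion estimate (about 1 if Poisson is adequate)
pois_fit <- glm(num_detections ~ covid_period * diel_period * SEX,
                family = poisson, data = sharks)
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)
dispersion

# Negative binomial fit with the same linear predictor
nb_fit <- glm.nb(num_detections ~ covid_period * diel_period * SEX, data = sharks)
AIC(pois_fit, nb_fit)
```

Note that raw residuals from a count GLM are not expected to look normal, so a `qqnorm` plot of them is a weak diagnostic on its own.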
GAMM4 smoothing spline for time variable
I am constructing a GAMM model (for the first time) to compare longitudinal slopes of cognitive performance in a Bipolar Disorder (BD) sample with those of a healthy control (HC) sample. The study design is referred to as an "accelerated longitudinal study", where participants across a large span of ages (25-60) are followed for 2 years (HC group) and 4 years (BD group). Hypothesis (1): the BD group's yearly rate of change in processing speed will be higher overall than the healthy control group's, suggesting a more rapid cognitive decline in BD than seen in HC. Here is my R code formula, which I think is a bit off: RUN2 <- gamm4(BACS_SC_R ~ group + s(VISITMONTH, bs = "cc") + s(VISITMONTH, bs = "cc", by=group), random=~(1|SUBNUM), data=Df, REML = TRUE) The VISITMONTH variable is coded as "months from first visit": visit 1 equals 0, and the following visits (3 per year) are coded as months elapsed from visit 1. Is a cyclic smooth correct in this case? I plan on adding additional variables (i.e. peripheral inflammation) to the model to predict individual slopes of cognitive trajectories in BD. If you have any other suggestions, they would be greatly appreciated. Thank you!
If VISITMONTH runs over years (i.e. for a BD observation we would have VISITMONTH in {0, 1, 2, ..., 48} for the four years), then no, you don't want a cyclic smooth unless there is some 4-year periodicity that would mean 0 and 48 should be constrained to be the same. The default thin plate spline bs = 'tp' should suffice. I'm also assuming that there are many possible values for VISITMONTH, as not everyone was followed up at the same monthly intervals? Otherwise you're not going to have many degrees of freedom available for the temporal smooth. Is group coded as an ordered factor here? If so, that's great; the by smooth will encode the difference between the reference level (be sure to set HC as the reference level) and the other level, so you can see a test for a difference of the BD group directly in the summary. It's not clear how you are dealing with the fact that HC are followed up over fewer months than the BD group. It looks like the model has VISITMONTH representing the full time of the study, not just a within-year term. So how do you intend to compare the BD group with the HC group for the 2 years where the HC group are not observed?
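The advice above (thin plate smooth instead of cyclic, group as an ordered factor with HC as reference) can be sketched as follows. This is not the original gamm4 call: to stay within packages that ship with R, the sketch swaps in `mgcv::gam()` with a random-intercept smooth `s(SUBNUM, bs = "re")` in place of `random = ~(1|SUBNUM)`. The data are simulated and entirely hypothetical; only the variable names come from the question.

```r
library(mgcv)
set.seed(1)

# HYPOTHETICAL data: 20 HC followed for 24 months, 20 BD for 48 months,
# roughly 3 visits per year at irregular times
make_visits <- function(id, g) {
  max_m  <- if (g == "HC") 24 else 48
  n_gap  <- max_m / 4
  months <- c(0, seq(4, max_m, by = 4) + runif(n_gap, -1.5, 1.5))
  data.frame(SUBNUM = id, group = g, VISITMONTH = months)
}
ids    <- 1:40
groups <- rep(c("HC", "BD"), each = 20)
Df <- do.call(rbind, Map(make_visits, ids, groups))

# Simulated outcome: subject-specific level, steeper decline in BD
subj_int <- rnorm(40, sd = 5)
slope    <- ifelse(Df$group == "BD", -0.15, -0.05)
Df$BACS_SC_R <- 55 + subj_int[Df$SUBNUM] + slope * Df$VISITMONTH +
  rnorm(nrow(Df), sd = 2)

Df$SUBNUM <- factor(Df$SUBNUM)
# Ordered factor with HC as the reference (first) level
Df$group  <- factor(Df$group, levels = c("HC", "BD"), ordered = TRUE)

# s(VISITMONTH) is the reference (HC) trend; the by-smooth is the BD difference
fit <- gam(BACS_SC_R ~ group +
             s(VISITMONTH, bs = "tp") +
             s(VISITMONTH, by = group, bs = "tp") +
             s(SUBNUM, bs = "re"),
           data = Df, method = "REML")

summary(fit)$s.table
```

In the smooth-term table, the row for the by-smooth tests whether the BD trajectory differs from the HC reference trajectory. Beyond month 24 that comparison rests on extrapolating the HC trend, which is the concern raised at the end of the answer.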
How do I code a Mixed effects model for abalone growth in Aquaculture nutrition with nested individuals
I am a biologist working in aquaculture nutrition research, and until recently I hadn't paid much attention to the power of statistics. The usual method of analysis has been to run an ANOVA on the final weights of animals given various treatments and, boom, you have a result. I have tried to improve my results by designing an experiment that tracks individuals' growth over time, but I am having a really hard time understanding which model to use for my data. A simplified explanation of my experiment: I have 900 abalone (snails) sourced from a single cohort (spawned at the same time). I individually marked each abalone (id) and recorded its length and weight at Time 0. The animals were then randomly assigned to 1 of 6 treatment diets (n=30 abalone per treatment tank), each replicated n=5 times (n=150 abalone per treatment). Each replicate is a block in a randomized block design: each treatment appears only once within each block and is assigned to an independent tank with n=30 abalone per tank (one tank per treatment). Abalone were fed a known amount of feed for 90 days before being weighed and measured again (Time 1). They are now back in their homes for another 90 days before the experiment concludes. From my understanding: fixed effects: Time, Treatment; nested random effects: replicate, id. My raw data is entered in long format, with each row being a unique animal and columns for Time (0 or 1), Replicate (1-5), Treatment (1-6), Sex (M or F), Animal ID (1-900), Length (mm), Weight (g), and Condition Factor (Weight/Length^2.99*5655). I have taken columns from my raw data and converted them to factors and vectors before using the new variables to create a data frame.
id<-as.factor(data.long[,5])
time<-as.factor(data.long[,1])
replicate<-as.factor(data.long[,2])
treatment<-data.long[,3]
weight<-as.vector(data.long[,7])
length<-as.vector(data.long[,6])
cf<-as.vector(data.long[,10])
My data frame currently has the following structure:
df1<-data.frame(time,replicate,treatment,id,weight,length,cf)
I am struggling to understand how to nest my individual abalone within replicates. I can convert the weight data to change from initial, but I think the nlme package already accounts for this change when the model is coded correctly. I could also create another measure, Specific Growth Rate, for each animal at Time 1, but this would not allow the Time factor to be used.
lme(weight ~ time*treatment, random=~1 | id, method="ML", data=df1)
I would like to structure a mixed effects model so that my code takes into account individual animal variability when detecting statistical differences in weight at Time 1 between treatments.
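One way to express the nesting described above in nlme is with the `/` operator in the `random` argument, so that animals sit within tanks and tanks within replicates (blocks). The sketch below is a minimal illustration on simulated data, not the asker's data. The `tank` factor (one tank per replicate-by-treatment combination), the data frame `long`, and all simulated values are assumptions introduced for the example; treating treatment as a factor is also an assumption.

```r
library(nlme)  # ships with standard R distributions
set.seed(42)

# HYPOTHETICAL layout: 5 replicates x 6 treatments x 30 animals = 900 animals
design <- expand.grid(animal = 1:30,
                      treatment = factor(1:6),
                      replicate = factor(1:5))
design$tank <- interaction(design$replicate, design$treatment, drop = TRUE)
design$id   <- factor(seq_len(nrow(design)))

# Long format: one row per animal per time point (Time 0 and Time 1)
long <- rbind(transform(design, time = 0), transform(design, time = 1))
long$time <- factor(long$time)

# Simulated weights with replicate, tank, and animal-level variation
rep_eff  <- rnorm(5,   sd = 1)
tank_eff <- rnorm(30,  sd = 1)
id_eff   <- rnorm(900, sd = 2)
growth   <- 5 + 0.5 * as.integer(long$treatment)
long$weight <- 20 +
  rep_eff[as.integer(long$replicate)] +
  tank_eff[as.integer(long$tank)] +
  id_eff[as.integer(long$id)] +
  (long$time == "1") * growth +
  rnorm(nrow(long), sd = 1)

# Random intercepts for replicate, tank within replicate, animal within tank
fit <- lme(weight ~ time * treatment,
           random = ~ 1 | replicate/tank/id,
           data = long, method = "ML")

anova(fit)    # the time:treatment row tests whether growth differs by diet
VarCorr(fit)  # variance at each nesting level
```

With this structure, the `time:treatment` interaction is the term that addresses whether weight change from Time 0 to Time 1 differs between diets, while the random intercepts absorb variation between blocks, tanks, and individual animals.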