Estimating regression to the mean in R

I am working with a set of 42345 patients, from which I define subsets such as obese/non-obese.
I am wondering if there is a way in R to estimate the regression to the mean.
This may be a good resource in case someone who knows base R can help me. From this paper, I would be interested in expressing this equation in R:
Thank you very much in advance.
Best regards
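The equation from the paper is not reproduced above, so the following is only a sketch under a stated assumption. It uses one commonly cited formula for the expected regression-to-the-mean (RTM) effect when subjects are selected because their baseline value exceeds a cutoff c and the measurements are normally distributed: RTM = sigma_w^2 / sqrt(sigma_w^2 + sigma_b^2) * C(z), where z = (c - mu) / sqrt(sigma_w^2 + sigma_b^2) and C(z) = phi(z) / (1 - Phi(z)). This may not be the equation the paper uses. The function name rtm_effect and all numeric inputs are hypothetical.

```r
# Expected RTM effect for subjects selected with baseline > cutoff.
# ASSUMPTION: normal data and the formula stated in the lead-in;
# this is not necessarily the equation from the paper in the question.
rtm_effect <- function(mu, sd_between, sd_within, cutoff) {
  sd_total <- sqrt(sd_between^2 + sd_within^2)   # total standard deviation
  z <- (cutoff - mu) / sd_total                  # standardized cutoff
  # C(z) = phi(z) / (1 - Phi(z)), using base R's normal density and tail
  c_z <- dnorm(z) / pnorm(z, lower.tail = FALSE)
  sd_within^2 / sd_total * c_z
}

# Made-up illustration: BMI with population mean 27, between-subject SD 4,
# within-subject SD 1.5, selecting patients with baseline BMI above 30.
example_rtm <- rtm_effect(mu = 27, sd_between = 4, sd_within = 1.5, cutoff = 30)
print(example_rtm)
```

In practice mu and the two standard deviations would have to be estimated from the data (for example from repeated measurements), which the sketch does not attempt.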

Related

How To Perform Alternate Shapley Calculations in R (Shapley-Lorenz or Mean Shapley Values)

The article below mentions Shapley-Lorenz as an alternative to regular Shapley values for XAI. The only Shapley-Lorenz implementation I have found so far is some work started in Python here: https://github.com/roye10/ShapleyLorenz. Has anyone found code in R that calculates Shapley-Lorenz? I know R has a Shapley function, but I am not sure how to adapt it to Shapley-Lorenz.
Alternatively, if Shapley-Lorenz is a reach for R and not practical to implement, has anyone tried mean Shapley values in R? I assume that might be an easier path but am not sure how to adapt it.
Both of the above are twists I am considering for a class assignment requirement, so any guidance is deeply appreciated. Thanks!
Giudici, P., & Raffinetti, E. (2021). Shapley-Lorenz eXplainable artificial intelligence. Expert Systems with Applications, 167, 114104.
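For the "mean Shapley values" route, here is a minimal base R sketch. It does not implement Shapley-Lorenz. It assumes that "mean Shapley value" means the mean absolute Shapley value per feature, that the model is a plain linear regression, and that features are treated as independent, in which case the Shapley value of feature j for one observation reduces to beta_j * (x_j - mean(x_j)). The built-in mtcars data and the chosen predictors are only for illustration.

```r
# Shapley values for a linear model under the feature-independence assumption:
# phi[i, j] = beta_j * (x[i, j] - mean(x[, j]))
fit <- lm(mpg ~ wt + hp, data = mtcars)
X <- as.matrix(mtcars[, c("wt", "hp")])
beta <- coef(fit)[colnames(X)]

centered <- sweep(X, 2, colMeans(X))    # subtract each column's mean
phi <- sweep(centered, 2, beta, `*`)    # scale each column by its coefficient

# Global importance: mean absolute Shapley value per feature
mean_abs_shap <- colMeans(abs(phi))
print(mean_abs_shap)
```

For each row, the Shapley values sum to the prediction minus the average prediction, which is a quick sanity check on any implementation.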

How do I find the exact equation for a Caret model in R?

I used caret to create a regression model of a dataset in R, and I want to extract the fitted equation so I can use it in other websites (e.g. Desmos). I have not been able to find any information on how to do this, so if anyone has answers, that would be much appreciated! :D
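One possible approach, sketched under the assumption that the model is an ordinary linear regression: turn the named coefficient vector into an equation string. The helper equation_from_coefs is hypothetical, and the example uses base R's lm on the built-in mtcars data so that it runs without extra packages. For a caret train object fitted with method = "lm", coef(model$finalModel) should give the same kind of named vector. Other caret methods (trees, boosting, etc.) do not have a single closed-form equation.

```r
# Hypothetical helper: build "y = b0 + b1*x1 - b2*x2" from named coefficients
equation_from_coefs <- function(coefs, response = "y", digits = 4) {
  term_names <- names(coefs)
  vals <- signif(unname(coefs), digits)
  is_intercept <- term_names == "(Intercept)"
  slopes <- vals[!is_intercept]
  # Put the sign into the separator so negative slopes read naturally
  parts <- paste0(ifelse(slopes < 0, " - ", " + "), abs(slopes), "*",
                  term_names[!is_intercept])
  paste0(response, " = ", vals[is_intercept], paste(parts, collapse = ""))
}

# Illustration with a plain lm fit
fit <- lm(mpg ~ wt + hp, data = mtcars)
eq <- equation_from_coefs(coef(fit), response = "mpg")
print(eq)
```

The resulting string can be pasted into Desmos after renaming the variables to single letters.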

How to run Longitudinal Ordinal Logistic Regression in R

I'm working with a large data set of patients measured repeatedly over multiple months, with ordered outcomes on a severity scale from 1 to 5. I was able to analyze the first set of patients using the polr function to run a basic ordinal logistic regression model, but I now want to analyze the association across all time points using a longitudinal ordinal logistic model. So far I can't find any clear documentation online or on this site explaining which package to use and how to use it. I am also an R novice, so any simple explanations would be incredibly useful. Based on some initial searching, it seems like the mixor function might be what I need, though I am not sure how it works. I found it here:
https://cran.r-project.org/web/packages/mixor/vignettes/mixor.pdf
Would appreciate a simple explanation of how to use this function if this is the right one, or would happily take any alternate suggestions with an explanation.
Thank you in advance for your help!
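As one alternative suggestion, a cumulative link mixed model with a random intercept per patient is a common way to handle repeated ordinal outcomes. The sketch below uses clmm from the ordinal package rather than mixor, and it runs on simulated data. The column names patient, month, and severity, the effect sizes, and the cut points are all made up, and the model call is skipped if the ordinal package is not installed.

```r
set.seed(1)
n_patients <- 100
n_months <- 4

# Simulated long-format data: one row per patient per month
d <- expand.grid(patient = factor(seq_len(n_patients)),
                 month = seq_len(n_months))
u <- rnorm(n_patients)  # patient-specific random intercepts
latent <- 0.3 * d$month + u[as.integer(d$patient)] + rlogis(nrow(d))

# Cut the latent score into an ordered 1-5 severity scale
d$severity <- cut(latent, breaks = c(-Inf, -1, 0, 1, 2, Inf),
                  labels = 1:5, ordered_result = TRUE)

# Random-intercept ordinal logistic model (needs the 'ordinal' package)
if (requireNamespace("ordinal", quietly = TRUE)) {
  fit <- ordinal::clmm(severity ~ month + (1 | patient), data = d)
  print(summary(fit))
}
```

The key points are that the data are in long format, the outcome is an ordered factor, and (1 | patient) accounts for repeated measurements on the same patient.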

R - replicate weight survey

Currently I'm interested in learning how to obtain information from the American Community Survey PUMS files. I have read some of the ACS documentation and found that to use the replicate weights I must use the following formula:
Thanks to Google, I also found that there's the survey package and its svrepdesign function to help me get this done:
https://www.rdocumentation.org/packages/survey/versions/3.33-2/topics/svrepdesign
Now, even though I'm getting into R and learning statistics and have a SQL background, there are two BIG problems:
1 - I have no idea what that formula means and I would really like to understand it before going any further
2 - I don't understand how the svrepdesign function works or how to use it.
I'm not looking for someone to solve my problems for me, but I would really appreciate it if someone could point me in the right direction and give me a jump start.
Thank you for your time.
When you use svrepdesign, you are specifying a design with replicate weights, and it uses the formula you provided to calculate the standard errors.
The American Community Survey has 80 replicate weights, so it first calculates the statistic you are interested in with the full-sample weights (X), then calculates the same statistic with each of the 80 replicate weights (X_r).
You should read this: https://usa.ipums.org/usa/repwt.shtml
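To make the X and X_r description concrete, here is a base R sketch. It assumes the formula in question is the standard ACS replicate-weight one, SE(X) = sqrt(4/80 * sum over r of (X_r - X)^2). The data are simulated, with columns named like the ACS person weights (PWGTP, PWGTP1 to PWGTP80). The helper acs_replicate_se and the employed variable are hypothetical.

```r
# ASSUMPTION: SE = sqrt(4/80 * sum((X_r - X)^2)) with 80 replicate estimates
acs_replicate_se <- function(full_estimate, replicate_estimates) {
  stopifnot(length(replicate_estimates) == 80)
  sqrt(4 / 80 * sum((replicate_estimates - full_estimate)^2))
}

# Simulated stand-in for a PUMS extract (not real ACS data)
set.seed(123)
n <- 50
pums <- data.frame(employed = rbinom(n, 1, 0.6),
                   PWGTP = runif(n, 50, 150))
rep_names <- paste0("PWGTP", 1:80)
for (nm in rep_names) pums[[nm]] <- pums$PWGTP * runif(n, 0.5, 1.5)

# X: weighted total with the full-sample weight
full_est <- sum(pums$employed * pums$PWGTP)
# X_r: the same weighted total, once per replicate weight
rep_est <- vapply(rep_names,
                  function(nm) sum(pums$employed * pums[[nm]]),
                  numeric(1))

se <- acs_replicate_se(full_est, rep_est)
print(c(estimate = full_est, se = se))
```

In the survey package, the same setup is commonly expressed as svrepdesign(weights = ~PWGTP, repweights = "PWGTP[0-9]+", type = "JK1", scale = 4/80, rscales = rep(1, 80), mse = TRUE, data = ...). Treat those arguments as an assumption to check against the package documentation.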

Non-negative linear regression with xgboost

This is my first question here so I'm sorry if it's not properly asked.
I'm playing around with the xgboost function in R, and I was wondering if there is a simple parameter I could change so that my linear regression (objective=reg:linear) is restricted to non-negative coefficients. I know I can use nnls for non-negative least squares regression, but I would prefer a stepwise solution like the one xgboost offers.
If there is no easy way but a complicated one I would be happy to hear that, too. I read there is an option to build custom objective functions. So maybe you could change the reg:linear function at some point to get the non-negativity?
Thank you very much for your advice in advance!
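This is not an xgboost solution, but as a point of comparison the non-negativity constraint itself can be sketched in base R as box-constrained least squares, using optim with method = "L-BFGS-B" and lower = 0. The data are simulated and the true coefficients (2, 0.5, -1) are made up; the third one is negative on purpose so the constraint becomes active.

```r
set.seed(42)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- drop(X %*% c(2, 0.5, -1)) + rnorm(n, sd = 0.1)

# Sum of squared errors and its gradient with respect to beta
sse <- function(beta) sum((y - X %*% beta)^2)
sse_grad <- function(beta) drop(-2 * t(X) %*% (y - X %*% beta))

# Minimize the SSE subject to beta >= 0 (no intercept in this sketch)
fit <- optim(par = rep(0, ncol(X)), fn = sse, gr = sse_grad,
             method = "L-BFGS-B", lower = 0)
beta_nn <- setNames(fit$par, colnames(X))
print(beta_nn)
```

The coefficient for x3 is pushed to the boundary instead of going negative, which is the behaviour a non-negative regression is meant to have.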

Resources