Automatically scaling numbers in a table using stargazer - r

I wish to create a number of tables with numerical elements of very different scales. For example, tables with the variances of various variables down the diagonal and the correlations off the diagonal. The larger numbers make the tables too wide and harder to read.
Is there a good way, using stargazer (or a similar package), to scale down the much larger elements and indicate this with a footnote, or to use exponential notation automatically?
For example, the following R code creates a matrix where the diagonal elements are much larger than the other elements.
library(stargazer)
x <- matrix(rnorm(25, 0, 1), 5, 5)
diag(x) <- rnorm(5, 10000000, 10)
stargazer(x, summary = FALSE, digits = 2)
Any help much appreciated.

You could adapt something like this, which prints values below 100 in fixed notation and everything else in exponential notation:
ifelse(x < 100, sprintf("%0.2f", x), sprintf("%0.5e", x))
# [,1] [,2] [,3] [,4] [,5]
#[1,] "9.99999e+06" "-0.79" "-0.56" "0.91" "-2.57"
#[2,] "-0.13" "9.99999e+06" "-1.83" "-0.34" "1.73"
#[3,] "-0.48" "0.38" "1.00000e+07" "1.40" "-0.32"
#[4,] "-0.05" "-0.62" "0.91" "1.00000e+07" "1.15"
#[5,] "-0.09" "-0.33" "-0.16" "0.35" "9.99999e+06"
Or, without quotes:
noquote(ifelse(x < 100, sprintf("%0.2f", x), sprintf("%0.5e", x)))
# [,1] [,2] [,3] [,4] [,5]
#[1,] 9.99999e+06 -0.79 -0.56 0.91 -2.57
#[2,] -0.13 9.99999e+06 -1.83 -0.34 1.73
#[3,] -0.48 0.38 1.00000e+07 1.40 -0.32
#[4,] -0.05 -0.62 0.91 1.00000e+07 1.15
#[5,] -0.09 -0.33 -0.16 0.35 9.99999e+06
This essentially converts the numbers to text so that they print the way you want. For more on the format options, see ?sprintf.
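The same idea can be wrapped in a small helper so that the cutoff and the number of digits are adjustable. This is only a sketch: format_mixed and its arguments are hypothetical names, not part of stargazer, and it tests abs(x) so that large negative values also switch to exponential notation (the one-liner above would print them in fixed notation).

```r
# Hypothetical helper: fixed notation below `cutoff` (in absolute value),
# exponential notation at or above it. Returns a character object with the
# same shape as x, because ifelse() keeps the attributes of its test.
format_mixed <- function(x, cutoff = 100, digits = 2, sci_digits = 5) {
  fixed_fmt <- paste0("%.", digits, "f")      # e.g. "%.2f"
  sci_fmt   <- paste0("%.", sci_digits, "e")  # e.g. "%.5e"
  ifelse(abs(x) < cutoff, sprintf(fixed_fmt, x), sprintf(sci_fmt, x))
}

set.seed(42)
x <- matrix(rnorm(25, 0, 1), 5, 5)
diag(x) <- rnorm(5, 10000000, 10)
noquote(format_mixed(x))
```

The result is a character matrix, so it should be usable wherever a table function accepts text cells; whether a given package prints the strings unchanged is worth checking.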

Related

How to repeat a Function and store values in R using the function sim.msm

I would like to run the function below 10000 times and store the results. The function is available in the msm package (R).
sim.msm(qmatrix,15)
Result:
$states
[1] 1 2 3 2 3 2 2
$times
[1] 0.000000 1.538988 2.240587 9.695302 11.002184 14.998754 15.000000
$qmatrix
[,1] [,2] [,3]
[1,] -0.11 0.10 0.01
[2,] 0.05 -0.15 0.10
[3,] 0.02 0.07 -0.09
This is only one simulation. I need 10000 like this.
I would be grateful if someone could help me.
replicate() repeats the same expression N times. Here N = 10, and simplify = FALSE returns the results as a list:
replicate(10, sim.msm(qmatrix,15), simplify = FALSE)
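Once the simulations are stored in a list, individual pieces can be pulled out with vapply() or sapply(). A minimal sketch, assuming each result is a list with $states and $times as in the output above; simulate_once is a hypothetical stand-in for sim.msm(qmatrix, 15) so that the example runs without the msm package:

```r
# Hypothetical stand-in for sim.msm(qmatrix, 15): returns a list with
# $states and $times, shaped like the output shown in the question.
simulate_once <- function() {
  n <- sample(2:6, 1)
  list(states = sample(1:3, n, replace = TRUE),
       times  = c(0, sort(runif(n - 2, 0, 15)), 15))
}

set.seed(1)
sims <- replicate(1000, simulate_once(), simplify = FALSE)

# Final state of each simulated path
final_state <- vapply(sims, function(s) s$states[length(s$states)], integer(1))

# Number of observed states in each path
n_states <- vapply(sims, function(s) length(s$states), integer(1))

table(final_state)
```

With the real function, the replicate() call would be replicate(10000, sim.msm(qmatrix, 15), simplify = FALSE) and the extraction steps stay the same.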

Unexpected output containing plus, minus, and letters produced by subtracting one column of numbers from another in R

I have a data.frame containing a vector of numeric values (prcp_log).
waterdate PRCP prcp_log
<date> <dbl> <dbl>
1 2007-10-01 0 0
2 2007-10-02 0.02 0.0198
3 2007-10-03 0.31 0.270
4 2007-10-04 1.8 1.03
5 2007-10-05 0.03 0.0296
6 2007-10-06 0.19 0.174
I then pass this data through a Christiano-Fitzgerald band pass filter using the following command from the mFilter package.
library(mFilter)
US1ORLA0076_cffilter <- cffilter(US1ORLA0076$prcp_log,pl=180,pu=365,root=FALSE,drift=FALSE,
type=c("asymmetric"),
nfix=NULL,theta=1)
This creates an S3 object containing, among other things, a vector of "trend" values and a vector of "cycle" values, like so:
head(US1ORLA0076_cffilter$trend)
[,1]
[1,] 0.05439408
[2,] 0.07275321
[3,] 0.32150292
[4,] 1.07958965
[5,] 0.07799329
[6,] 0.22082246
head(US1ORLA0076_cffilter$cycle)
[,1]
[1,] -0.05439408
[2,] -0.05295058
[3,] -0.05147578
[4,] -0.04997023
[5,] -0.04843449
[6,] -0.04686915
Plotted:
plot(US1ORLA0076_cffilter)
I then apply the following operation in an attempt to remove the trend and seasonal components from the original numeric vector:
US1ORLA0076$decomp <- ((US1ORLA0076$prcp_log - US1ORLA0076_cffilter$trend) - US1ORLA0076_cffilter$cycle)
This produces values that include unexpected elements such as dashes and letters:
head(US1ORLA0076$decomp)
[,1]
[1,] 0.000000e+00
[2,] 0.000000e+00
[3,] 1.387779e-17
[4,] -2.775558e-17
[5,] 0.000000e+00
[6,] 6.938894e-18
What has happened here? What do these additional characters signify? How can I perform this operation and get the desired output, which is simply $prcp_log minus both the $trend and $cycle values?
I am happy to provide any additional info that will help; just ask.
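Values like 1.387779e-17 are ordinary numbers printed in scientific notation (1.387779 × 10^-17), so the "e" and "-" are part of the notation rather than stray characters. As far as I understand cffilter, the trend it returns is the original series minus the cycle, so subtracting both from prcp_log leaves only floating-point rounding error, which is effectively zero. A minimal base-R sketch using the six values printed above (the tolerance 1e-12 is an arbitrary choice):

```r
# Residuals copied from the output above
decomp <- c(0, 0, 1.387779e-17, -2.775558e-17, 0, 6.938894e-18)

# Print without scientific notation to see how small they are
format(decomp, scientific = FALSE)

# Treat anything below a small tolerance as zero
tol <- 1e-12
all(abs(decomp) < tol)
round(decomp, 10)

# The same kind of rounding error appears in simple decimal arithmetic
0.1 + 0.2 - 0.3
```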

Average pairwise correlations of an increasing number of variables (R)

I have a matrix called "variables" which includes 9 variables (9 columns). I have obtained the pairwise correlation matrix with this code:
matrix.cor <- cor(variables, method="kendall", use="pairwise")
Now I want to obtain the average pairwise correlation as a function of the number of variables considered. That is, the average of all possible correlations among 2 variables, 3 variables, 4 variables, and so on up to the 9 variables, in order to see the effect of adding variables. I have this R code (taken from an article that has more factors and columns), but it does not do what I need: I only obtain the average for all 9 variables.
pairwisecor.df = ddply(data,c("Exp"),function(x) {
Smax = unique(x$Rich)
x = x[,variables]
cormat = cor(t(x),use="complete.obs",method=c("kendall"))
data.frame(
Smax = Smax,
no.fn = nrow(x),
avg.cor = mean(cormat[lower.tri(cormat)]) ) } )
I think it shouldn't be very difficult to write a function that analyzes a cumulative number of variables, but my only reference is an article where the data is much more complicated.
Any ideas?
Here is a made-up example that calculates the mean of the lower triangle for sub-matrices of increasing size, starting from the upper left corner:
> (cormat <- matrix((1:25)/25, 5, 5))
[,1] [,2] [,3] [,4] [,5]
[1,] 0.04 0.24 0.44 0.64 0.84
[2,] 0.08 0.28 0.48 0.68 0.88
[3,] 0.12 0.32 0.52 0.72 0.92
[4,] 0.16 0.36 0.56 0.76 0.96
[5,] 0.20 0.40 0.60 0.80 1.00
> avg.cor = c()
> for (i in 2:dim(cormat)[1]) {
+ avg.cor=cbind(avg.cor,mean(cormat[1:i,1:i][lower.tri(cormat[1:i,1:i])]))
+ }
> avg.cor
[,1] [,2] [,3] [,4]
[1,] 0.08 0.1733333 0.2666667 0.36
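The loop above can also be written as a small function that returns a named vector. This is a sketch of the same cumulative calculation, not a different method; cumulative_avg_cor is a hypothetical name. Note that the result depends on the order of the columns, since variables are added from left to right.

```r
# Hypothetical helper: mean of the lower triangle of the leading k x k block
# of a correlation matrix, for k = 2, ..., number of variables.
cumulative_avg_cor <- function(cormat) {
  p <- ncol(cormat)
  res <- vapply(2:p, function(k) {
    block <- cormat[1:k, 1:k, drop = FALSE]
    mean(block[lower.tri(block)])
  }, numeric(1))
  names(res) <- paste0("n", 2:p)
  res
}

# Same made-up matrix as in the example above
cormat <- matrix((1:25) / 25, 5, 5)
avg.cor <- cumulative_avg_cor(cormat)
avg.cor
```

For the real data, the call would be cumulative_avg_cor(matrix.cor), with matrix.cor as computed in the question.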

Inter-scale correlation matrix wide format to long format (in R)

In one of my datafiles, correlation matrices are stored in long format, where the first three columns represent the variables and the last three columns represent the inter-scale correlations. Frustratingly, I have to admit, the rows may represent different sub-dimensions of a particular construct (e.g., the 0s in column 1).
The datafile (which is used for a meta-analysis) was constructed by a PhD student who "decomposed" all relevant correlation matrices by hand (so the wide-format matrix was not generated by another piece of code).
Example
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 4 1 -0.32 0.25 -0.08
[2,] 0 4 2 -0.32 0.15 -0.04
[3,] 0 4 3 -0.32 -0.04 0.27
[4,] 0 4 4 -0.32 -0.14 -0.16
[5,] 0 4 1 -0.01 0.33 -0.08
[6,] 0 4 2 -0.01 0.36 -0.04
[7,] 0 4 3 -0.01 0.04 0.27
[8,] 0 4 4 -0.01 -0.03 -0.16
My question is how to restore the inter-scale correlation matrix, such that:
       c1_0a  c1_0b   c2_4   c3_1   c3_2   c3_3   c3_4
c1_0a
c1_0b
c2_4   -0.32  -0.01
c3_1    0.25   0.33  -0.08
c3_2    0.15   0.36  -0.04
c3_3   -0.04   0.04   0.27
c3_4   -0.14  -0.03  -0.16
I suppose this can be done with the reshape2 package, but I am unsure how to proceed. Your help is greatly appreciated!
What I have tried so far (rather clumsy):
I identified the unique combinations of columns 1, 2, and 4, which I transposed to correlation matrix #1;
Similarly, I identified the unique combinations of columns 1, 3, and 5, which I transposed to correlation matrix #2;
Similarly, I identified the unique combinations of columns 2, 3, and 6, which I transposed to correlation matrix #3;
Next, I bound the three matrices together, which gives an incorrect solution.
The problem here is that matrix #1 has different dimensions [as there are two different correlations reported for the relationship between the first construct (0) and the second construct (4)] than matrix #3 [as there are eight different correlations reported for the second construct (4) and the third construct (1-4)].
I tried both the melt and reshape2 packages to overcome these problems (and to come up with a more elegant solution), but I did not find any clues about how to set up functions in these packages to reconstruct the correlation matrix.
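One possible base-R approach is to build a label for each variable and then fill an empty matrix by indexing it with those labels. This is only a sketch under a loud assumption: rows that share the same value in column 4 are taken to belong to the same sub-dimension of construct 1 (labelled a, b, ...), which would break if two sub-dimensions happened to have identical correlations. The names d, v1, v2, v3 and R are chosen here for illustration.

```r
# Example data from above: columns 1-3 are the variables,
# column 4 = r(c1, c2), column 5 = r(c1, c3), column 6 = r(c2, c3).
d <- matrix(c(
  0, 4, 1, -0.32,  0.25, -0.08,
  0, 4, 2, -0.32,  0.15, -0.04,
  0, 4, 3, -0.32, -0.04,  0.27,
  0, 4, 4, -0.32, -0.14, -0.16,
  0, 4, 1, -0.01,  0.33, -0.08,
  0, 4, 2, -0.01,  0.36, -0.04,
  0, 4, 3, -0.01,  0.04,  0.27,
  0, 4, 4, -0.01, -0.03, -0.16
), ncol = 6, byrow = TRUE)

# ASSUMPTION: identical values in column 4 identify one sub-dimension.
sub <- letters[match(d[, 4], unique(d[, 4]))]
v1 <- paste0("c1_", d[, 1], sub)
v2 <- paste0("c2_", d[, 2])
v3 <- paste0("c3_", d[, 3])

# Empty matrix with one row and column per distinct variable
vars <- c(unique(v1), unique(v2), unique(v3))
R <- matrix(NA_real_, length(vars), length(vars), dimnames = list(vars, vars))

# Fill the lower triangle using (row name, column name) pairs
R[cbind(v2, v1)] <- d[, 4]
R[cbind(v3, v1)] <- d[, 5]
R[cbind(v3, v2)] <- d[, 6]
R
```

Cells with no reported correlation stay NA, which matches the empty cells in the target layout.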

How to solve linear equations in R with rectangular matrix

I'm trying to solve a 7x2 matrix problem of the form below using R:
A=array(c(5.54,0.96,1.59,2.07,0.73,10.64,8.28,1.41,3.77,3.11,3.74,2.93,8.29,3.33), c(7,2))
A
# [,1] [,2]
#[1,] 5.54 1.41
#[2,] 0.96 3.77
#[3,] 1.59 3.11
#[4,] 2.07 3.74
#[5,] 0.73 2.93
#[6,] 10.64 8.29
#[7,] 8.28 3.33
b=c(80814.25,34334.75,47921.75,59514.25,26981.25,63010.25,46646.25)
b
#[1] 80814.25 34334.75 47921.75 59514.25 26981.25 63010.25 46646.25
solve (A,b)
Error in solve.default(A, b) : 'a' (7 x 2) must be square
A %*% solve (A,b)
Error in solve.default(A, b) : 'a' (7 x 2) must be square
What can I do to solve this problem? I need a solution for the two variables, x1 and x2, in the 7x2 system stated above.
solve() requires a square matrix, which is why it fails here. ?solve notes that qr.solve can handle non-square systems:
qr.solve(A,b)
[,1]
[1,] 3741.208
[2,] 6552.174
Check that this is what you need: with 7 equations and only 2 unknowns, the result is a least-squares fit rather than an exact solution. There are other ways to solve this type of problem.
The corpcor package offers a pseudoinverse function that computes the Moore-Penrose pseudoinverse of a rectangular matrix:
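One way to check the qr.solve result is to compare it with the normal equations, which give the same least-squares solution when A has full column rank, and to look at the residuals. A minimal sketch in base R, reusing A and b from the question (x_qr, x_ne and res are names chosen here):

```r
A <- array(c(5.54, 0.96, 1.59, 2.07, 0.73, 10.64, 8.28,
             1.41, 3.77, 3.11, 3.74, 2.93, 8.29, 3.33), c(7, 2))
b <- c(80814.25, 34334.75, 47921.75, 59514.25, 26981.25, 63010.25, 46646.25)

# Least-squares solution via QR decomposition
x_qr <- qr.solve(A, b)

# Same solution via the normal equations: (A'A) x = A'b
x_ne <- solve(crossprod(A), crossprod(A, b))

# Residuals: these are not zero, because no x satisfies all 7 equations exactly
res <- b - A %*% x_qr

x_qr
all.equal(as.numeric(x_qr), as.numeric(x_ne))
sum(res^2)
```

If the residuals are large relative to b, it may be worth asking whether a two-variable linear model is appropriate for the data.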
library(corpcor)
pseudoinverse(A)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.06271597 -0.05067830 -0.02922597 -0.03265713 -0.03964039 0.0230086
[2,] -0.05845856 0.08551514 0.05661287 0.06532450 0.06674243 0.0391552
[,7]
[1,] 0.07239133
[2,] -0.05420334
pseudoinverse(A) %*% b
[,1]
[1,] 3741.208
[2,] 6552.174
