Add an asterisk on top of the bars in a bar graph

Suppose that I have the following code to make these bar graphs. I am using Stata 16.
foreach z in "lgbt" "gender" "nac" "minority"{
preserve
gen time_trial_0_`z' = .
replace time_trial_0_`z' = time_trial_`z' if `z'_ev ==0
gen time_trial_1_`z' = .
replace time_trial_1_`z' = time_trial_`z' if `z'_ev ==1
graph bar (mean) time_trial_0_`z' time_trial_1_`z', over(orden_seleccion) ytitle(Average time on trial) b1title(Trial) legend(label(1 "No `z'") label(2 "`z'")) name(`z', replace) addlabels
restore
}
graph combine lgbt gender nac minority, ycommon xcommon imargin(0 0 0 0) graphregion(margin(l=22 r=22)) iscale(0.5) title(Time reviewing applications) name(All_time, replace)
So this gets me this:
Each pair of bars shows the average time on a trial for two groups (for example, minority vs. no minority).
How can I put an asterisk above the bars to indicate whether the difference between the two means is statistically significant (maybe using a t-test)?
By the way, how can I include some example data? Please simulate it if needed: I don't know how to share the data, or whether I can, because it is from work.

Related

What is the best way to give the same x and y data for each aes mapping for ggplot in R

I am trying to graph baseball data in R and am having trouble finding the best way to do this.
I have a CSV file of the data, taken from here and am trying to plot certain statistics on a scatter plot.
For example, something I was just graphing was this. Everything looks fine, but I want to make changing the statistics I am graphing easier.
Currently I have a separate geom_image call for every team, as seen below:
geom_image(data=subset(team_stats, Team == "Blue Jays" | Team == "TOR"), mapping = aes(x = HR.x, y = wRC.,
image = imageBlueJays), size = .1)+
geom_image(data=subset(team_stats, Team == "Red Sox" | Team == "BOS"), mapping = aes(x = HR.x, y = wRC.,
image = imageRedSox), size = .055)+
I have 28 more of these lines for the rest of the teams in the league. Initially I was thinking of doing a for loop, but I have no idea how I'd set the value for the "image" variable for each team.
So right now if I want to change a statistic, I have to do a search and replace for the stat I want to switch out. It'd be much simpler if I just had variables for the xstat and ystat and was able to just change those once, however I am not sure about how to do that.
If I set one of the variables inside aes() equal to something like team_stats$HR.x, it gives me an error saying:
"Error in `check_aesthetics()`:
! Aesthetics must be either length 1 or the same as the data (1): x
Run `rlang::last_error()` to see where the error occurred."
I believe this is because the length of team_stats$HR.x is just 30, which is the number of rows (one row for each team), while the length of the data frame is 677, which is the number of columns.
What can I do here?
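One way to avoid the 30 separate layers, sketched here with made-up data: keep one row per team, store the statistic names in two variables, and look the columns up by name with ggplot2's `.data` pronoun. The error most likely comes from each layer using a one-row subset(...) as its data while team_stats$HR.x supplies 30 values. The data values, the logo column, and the names xstat and ystat are assumptions for illustration; geom_point stands in for ggimage's geom_image so that the sketch needs only ggplot2.

```r
library(ggplot2)

# Hypothetical stand-in for team_stats: one row per team (values are made up).
team_stats <- data.frame(
  Team = c("Blue Jays", "Red Sox", "Yankees"),
  HR.x = c(200, 180, 220),
  wRC. = c(105, 98, 112),
  logo = c("bluejays.png", "redsox.png", "yankees.png"),  # assumed image paths
  stringsAsFactors = FALSE
)

# Change the plotted statistics in one place.
xstat <- "HR.x"
ystat <- "wRC."

# .data[[name]] looks a column up by its string name inside aes().
p <- ggplot(team_stats, aes(x = .data[[xstat]], y = .data[[ystat]])) +
  geom_point(size = 3) +  # with ggimage: geom_image(aes(image = logo), size = 0.08)
  labs(x = xstat, y = ystat)

print(p)
```

Because the image path is an ordinary column mapped inside aes(), a single layer covers every team; a per-team size could be handled the same way if the logos need different scaling.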

Black Scholes surface vs Time to Maturity and Strike Price in R

I've just found out about the "wireframe" function in R for plotting 3D surface graphs.
I wish to use it to plot the Black & Scholes call option price against two sequences of data: time to maturity and strike price. So, first of all, here is my script so far:
S=100 #My stock Price
K=90 #Initial Strike Price
T=1 #Initial Time to Maturity (1 year here)
RF=0.03 #Risk free rate
SIGMA=0.2 #Volatility
d1=(log(S/K) + (RF + 0.5*SIGMA^2)*T)/(SIGMA*sqrt(T)) #initial d(1)
d2=d1-SIGMA*sqrt(T) #initial d(2)
Then I tried to prepare a grid for my surface/3D plot:
K=seq(80,120,1)
T=seq(0,1,0.1)
table=expand.grid(K,T)
Last step, I add a new column variable for computing my Call price according to every single combination:
table$CALL= S*pnorm(d1) - K*exp(-RF*T)*pnorm(d2)
names(table)= c("K","T","CALL")
Finally the surface/3D plot:
wireframe(CALL ~ K * T, scales = list( arrows = FALSE),aspect = c(1, .6),data=table,
drape=TRUE,shade=TRUE)
So it plots an apparently reasonable graph (according to my finance studies), but it looks a bit stepped. As I'm a newbie with the "wireframe" function, I don't know whether I passed all the input data properly. I'd love an opinion from someone who has already plotted the B&S formula as a 3D surface. I'm interested because I'd like to do the same for the Greeks and implied volatility in the future.
Thanks in advance
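A possible cause of the stepped look, offered as a sketch and not as a verified diagnosis: d1 and d2 above are computed once, from the initial K and T, so they stay constant across the whole grid, and K and T are then recycled as separate vectors instead of being read row by row from the grid. Reassigning T also breaks the T shorthand for TRUE. The version below recomputes d1 and d2 for every row. The function name bs_call and the choice to start the maturities at 0.05 (to avoid dividing by zero at T = 0) are my assumptions.

```r
library(lattice)

# Black-Scholes call price, vectorised over strike K and maturity T_mat.
bs_call <- function(S, K, T_mat, r, sigma) {
  d1 <- (log(S / K) + (r + 0.5 * sigma^2) * T_mat) / (sigma * sqrt(T_mat))
  d2 <- d1 - sigma * sqrt(T_mat)
  S * pnorm(d1) - K * exp(-r * T_mat) * pnorm(d2)
}

S <- 100      # stock price
RF <- 0.03    # risk-free rate
SIGMA <- 0.2  # volatility

# One row per (K, T) combination; T starts above 0 to avoid division by zero.
grid <- expand.grid(K = seq(80, 120, 1), T = seq(0.05, 1, 0.05))
grid$CALL <- bs_call(S, grid$K, grid$T, RF, SIGMA)

# print() is needed for lattice plots when the code runs from a script.
print(wireframe(CALL ~ K * T, data = grid,
                scales = list(arrows = FALSE), aspect = c(1, 0.6),
                drape = TRUE))
```

The same pattern (a vectorised pricing function applied to the grid columns) should carry over to the Greeks, since each of them is also a function of K and T for fixed S, rate, and volatility.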

ggplot2 to generate a geom_bar()

I have the following table in RStudio, and I am trying to create a geom_bar() plot with ggplot2 to show the percentage of students who received financial assistance and the percentage who did not.
comparison_table <- data.frame(Students1, Percentage1, financial_assistance1, stringsAsFactors = F)
comparison_table
  Students1 Percentage1 financial_assistance1
1      PELL   0.4059046                  True
2    NOPELL   0.5018954                 False
3      LOAN   0.4053371                  True
4    NOLOAN   0.2290538                 False
My code for the bar plot is:
PELL<-mean(na.omit(completion_rate_based_on_financial_assistance$percent_of_students_with_Pell_Grant_and_completed_in_4_years))
NOPELL<-mean(na.omit(completion_rate_based_on_financial_assistance$percent_of_students_without_Pell_Grant_and_completed_in_4_years))
LOAN<-mean(na.omit(completion_rate_based_on_financial_assistance$percent_of_students_with_federal_loan_and_completed_in_4_years))
NOLOAN<-mean(na.omit(completion_rate_based_on_financial_assistance$percent_of_students_without_federal_loan_and_completed_in_4_years))
tab1<-cbind(PELL,LOAN)
tab2<-cbind(NOPELL,NOLOAN)
tab<-rbind(tab1,tab2)
rownames(tab) <- c("PELL","LOAN")
colnames(tab) <- c("With Financial Help","Without Financial Help")
barplot(tab,beside = F,legend.text= rownames(tab),xlab = "Financial Help",col=c("lightblue","pink"))
My question is: how can I generate this bar plot using ggplot2 and geom_bar()? For visualization purposes, I wish to generate two stacked bars, one containing the percentages of students who received Pell Grants and loans (PELL and LOAN) and another containing the percentages of students who did not receive Pell Grants and loans (NOPELL and NOLOAN).
tb<-data.frame(students1=c("PELL","NOPELL","LOAN","NOLOAN"), percentage1=c(40,50,40,23), financial_assistance1=c(TRUE,FALSE, TRUE, FALSE))
g<-ggplot(tb,aes(financial_assistance1,percentage1))
g+geom_bar(stat="identity",aes(fill=students1))
Explanation: create the ggplot with the x (financial_assistance1) and y (percentage1) variables, then add geom_bar. The one thing to remember is that geom_bar defaults to counting the cases of each x value and using that count as the bar height. Here you want the y variable as the bar height, which is what stat="identity" does. The aes(fill=students1) adds the two colours within each stacked bar.
UPDATE: Just noticed I misread what you tried to achieve, edited the code to correct for it.

How to put 2 boxplots in one graph in R without additional libraries?

I have this kind of dataset
Defect.found  Treatment   Program
1             Testing     Counter
1             Testing     Correlation
0             Inspection  Counter
3             Testing     Correlation
2             Inspection  Counter
I would like to create two boxplots, one of detected defects per program and one of detected defects per technique (treatment), both in one graph.
Meaning having:
boxplot(exp$Defect.found ~ exp$Treatment)
boxplot(exp$Defect.found ~ exp$Program)
In a joined graph.
Searching on Stack Overflow, I was able to create it, but only with the lattice library, by typing:
bwplot(exp$Treatment + exp$Program ~ exp$Defect.found)
But I would like to know if it's possible to create the graph without additional libraries like ggplot and lattice.
Prepare the plot window to receive two plots in one row and two columns (default is obviously one row and one column):
par(mfrow = c(1, 2))
I suggest not naming the data frame exp, because exp is already the name of the exponential function. Use, for instance, mydata.
Defects found against treatment (frame = F suppresses the external box):
with(mydata, plot(Defect.found ~ Treatment, frame = F))
Defects found against program (ylab = NA suppresses the y label because it is already shown in the previous plot):
with(mydata, plot(Defect.found ~ Program, frame = F, ylab = NA))
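If the two sets of boxes should share one set of axes instead of sitting in side-by-side panels, base R's boxplot() also accepts a list of numeric vectors. A minimal sketch using the five example rows from the question; the name mydata follows the suggestion above, and the dashed divider is just one way to separate the two groupings.

```r
mydata <- data.frame(
  Defect.found = c(1, 1, 0, 3, 2),
  Treatment    = c("Testing", "Testing", "Inspection", "Testing", "Inspection"),
  Program      = c("Counter", "Correlation", "Counter", "Correlation", "Counter")
)

# Split the same response by each grouping variable, then join the two lists.
groups <- c(split(mydata$Defect.found, mydata$Treatment),
            split(mydata$Defect.found, mydata$Program))

# One call draws all boxes on a common y axis; the result holds the summaries.
bp <- boxplot(groups, ylab = "Defect.found")
abline(v = 2.5, lty = 2)  # visual divider between Treatment and Program boxes
```

This needs no par(mfrow = ...) call, at the cost of mixing two different grouping variables on one x-axis.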

Lattice Histogram multiplot with different layouts

I wanted to take a closer look at the distribution of response times (RT) on questions. To do so, I used lattice to make histograms and show them in one figure. I used the following settings:
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
main=list(
label="RT distribution per subject",
cex=1.5),
xlab=list(
label="RT (s)",
cex=0.75),
ylab=list(
label="Percentage occurrence",
cex=1.2),
xlim=c(0,40),
breaks = 10
)
In other words, I want each participant's data to be plotted on an x-axis from 0 to 40 seconds, divided into 10 bars. This works for some sub-plots, but many of them use different breaks. I added the figure. Why does the function not use the same layout for every sub-plot?
I found a solution to the problem. Instead of specifying the number of breaks, you can specify a vector of break points as follows:
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
xlim=c(0,40),
breaks = c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55)
)
or, more simply,
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
xlim=c(0,40),
breaks = seq(from=0,to=55,by=1)
)
Note, however, that the range of the breaks must include every data point. For more, see the CRAN documentation for lattice's histogram function.
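To get equal-width bars on every panel while still covering every observation, the break vector can be derived from the data instead of being typed out. A sketch with simulated response times (the data frame d and its values are made up); the bins are 4 s wide so that the 0 to 40 s window spans ten bars.

```r
library(lattice)

set.seed(1)
# Simulated stand-in data: 24 participants, 20 response times each.
d <- data.frame(rt = rexp(24 * 20, rate = 1 / 8),
                pp = factor(rep(1:24, each = 20)))

# Breaks every 4 s from 0 up to the first multiple of 4 at or above max(rt).
brks <- seq(0, 4 * ceiling(max(d$rt) / 4), by = 4)

# The same break vector is applied to every panel.
print(histogram(~ rt | pp, data = d, layout = c(6, 4),
                xlim = c(0, 40), breaks = brks,
                xlab = "RT (s)"))
```

Observations above 40 s are still binned but fall outside the visible x range, which keeps the panels comparable without raising an error about uncovered data.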
