How to do a fancy box-plot using ggplot2? [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 2 years ago.
I am trying to draw a box plot with ggplot2 using the Wage data set from the ISLR package. The box plot is meant to show wage against educational level, which has five categories. When I run the usual code to generate the box plot, I get the following message and error in RStudio:
Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (3000): y
My code is
library("ISLR")
library("MASS")
setwd("C:/Users/Alonso/Desktop/ITSL")
View(Wage)
ggplot(Wage, aes(x=education, y=Wage))+
geom_boxplot(outlier.colour="red", outlier.shape=8, outlier.size=4)+labs(x="Nivel de estudio", y="Salario")
I have made other plots, but only with numeric variables, so maybe the problem is that I am now using a categorical variable. Any ideas? Thanks in advance, and greetings from Chile.

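You were almost there: you just need a lowercase y=wage, because the column is named wage, not Wage. Wage is the data frame itself, which is why ggplot2 complains about an object of type data.frame and a length mismatch for y.
library(ISLR)
library(ggplot2)
ggplot(Wage, aes(x=education, y=wage)) +
geom_boxplot(outlier.colour="red", outlier.shape=8, outlier.size=4) + labs(x="Nivel de estudio", y="Salario")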
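A quick way to catch this kind of case mismatch is to inspect the column names before plotting. The following is only a minimal sketch: `wage_toy` and its values are made up for illustration and are not the real ISLR data.

```r
# Hypothetical stand-in for the ISLR Wage data; the values are invented.
wage_toy <- data.frame(
  education = c("HS Grad", "College Grad", "HS Grad"),
  wage = c(75.0, 130.9, 70.5)
)

names(wage_toy)               # lists the exact (case-sensitive) column names
"wage" %in% names(wage_toy)   # TRUE
"Wage" %in% names(wage_toy)   # FALSE: column names are case-sensitive
is.null(wage_toy$Wage)        # TRUE: there is no column called Wage
```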

Related

Unable to modify ggplot2 x-axis labels [closed]

I have a dataset such as the following:
CON <- data_frame(norm.d.2=c(1.37,1.11,1.84),CDSex.=c(0.439,0.335,0.432))
I am plotting this data frame with ggplot2, but I am unable to change the x axis labels. I have tried both scale_x_continuous and scale_x_discrete, but either I receive an error or no labels at all.
ggplot(CONrc,aes(x=norm.d.2,y=CDSex.)) +
geom_point(aes(color=factor(interaction(hpi,rep)))) +
xlab('Exonic % of Cellular Reads') +
ylab('CDS % of Exonic Reads')
When I try scale_x_continuous(breaks=c(0.5,1.5,2.5)), I get the following error:
Error: Discrete value supplied to continuous scale
When I try scale_x_discrete(breaks=c(0.5,1.5,2.5)), I do not get an error message, but my plot loses all x axis labels.
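The plot works fine with the CON table as posted, so the problem is probably in CONrc: the error suggests that norm.d.2 is not numeric there (for example, a character or factor column). Try converting it:
CONrc$norm.d.2 <- as.double(CONrc$norm.d.2)
and then plot again with scale_x_continuous. If the column is a factor, convert it with as.double(as.character(CONrc$norm.d.2)) instead, because as.double on a factor returns the internal level codes rather than the values.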
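To illustrate the difference between converting a character column and converting a factor, here is a small sketch. The vectors `x_chr` and `x_fct` are hypothetical and only reuse the numbers from the CON example; the actual type of the column in CONrc is not known from the question.

```r
# Hypothetical columns holding the same numbers as text and as a factor.
x_chr <- c("1.37", "1.11", "1.84")
x_fct <- factor(x_chr)

as.double(x_chr)                 # 1.37 1.11 1.84
as.double(x_fct)                 # 2 1 3 (internal level codes, not the values)
as.double(as.character(x_fct))   # 1.37 1.11 1.84
```

Checking `class(CONrc$norm.d.2)` first would show which of the two cases applies.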

Why are these states not displaying the data when the data exists with the usmap package? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
I'm trying to plot election results on the US map with the usmap package, but even though the dataset is complete, I get a plot that shows missing values for some states. Those states are greyed out and I'm not sure why this is happening.
plot_usmap(data=data_total,values='percent_biden')+
scale_fill_continuous(low='red',high='blue',name='Percent for Biden')+
theme(legend.position='right')+
ggtitle(paste("Total Popular Vote of Final Results"))
You are incorrectly assuming that usmap will infer any format for state names. For instance, both of these produce a working map,
usmap::plot_usmap(data=data.frame(state=c("alabama","new york"),s=c(5,15)), values="s")
usmap::plot_usmap(data=data.frame(state=c("AL","NY"),s=c(5,15)), values="s")
whereas, judging from the picture of your data, you are passing something like
usmap::plot_usmap(data=data.frame(state=c("alabama","new-york"),s=c(5,15)), values="s")
# "new-york" contains a dash, not a space
So I believe you need to clean up your data and fix your state names.
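One possible way to do that cleanup in base R is sketched below. The vector `state_raw` is a hypothetical example, since the real contents of data_total are only visible in the asker's picture; the check against the built-in `state.name` vector is just a convenience for spotting names that still do not match.

```r
# Hypothetical raw names with dashes where spaces are expected.
state_raw <- c("alabama", "new-york", "north-carolina")

# Replace every dash with a space.
state_clean <- gsub("-", " ", state_raw, fixed = TRUE)
state_clean

# Names that still fail to match a lowercase US state name (none expected here).
state_clean[!(state_clean %in% tolower(state.name))]
```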

R glm regression not including several dummy variables [closed]

I have a data set (acs_hh) in which one of the columns is race_eth.
For the following regression:
reg <- glm(acs_hh$own ~ acs_hh$hhincome + acs_hh$race_eth, family = "binomial")
summary(reg)
However, in my data there exist more than just the four races mentioned in the summary; asian is also a race in my dataset.
Why is R not calculating a coefficient for Asians, i.e. acs_hh$race_ethasian, non-hisp?
When using dummy variables one of the categories is excluded and serves as the reference category to which all the others are compared. So to calculate fitted values for Asian, non-hisp you would set all of the other categories to 0.
Because "asian" is the reference level of acs_hh$race_eth -- all the other coefficients represent the effect relative to the reference level (which in your case, I suspect is "asian" because that is the alphabetically first level).
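A small sketch of how the reference level works, and how it could be changed with relevel(). The factor `race` below is a hypothetical toy variable, not the asker's acs_hh$race_eth.

```r
# Hypothetical toy factor; levels are sorted alphabetically by default.
race <- factor(c("white", "asian", "black", "hispanic", "asian", "white"))
levels(race)[1]                  # "asian" is the reference level

# The design matrix has no column for the reference level.
colnames(model.matrix(~ race))   # "(Intercept)" "raceblack" "racehispanic" "racewhite"

# Choosing a different reference level gives "asian" its own coefficient.
race2 <- relevel(race, ref = "white")
colnames(model.matrix(~ race2))  # "(Intercept)" "race2asian" "race2black" "race2hispanic"
```

With the reference changed, each coefficient is interpreted relative to the new reference level instead.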

How to choose the first "n" elements in R [closed]

(USING R)
So I imported a data set by using
xcars <- read.csv(file.choose())
and then I chose my data set which was originally an excel file.
So, I have a column named dist (short for displacement) and I want to choose the first 25 entries underneath that column and then plot it on a histogram, so I attempted the following.
carsUpTo25 <- xcars(1:25,)
hist(carsUpTo25$dist)
Of course this didn't work. However, any help on how I would do this would be helpful.
Rows are selected with square brackets, not parentheses. Try this:
hist(xcars$dist[1:25])
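As a runnable sketch of the same idea, the built-in `cars` data set (which happens to have a `dist` column, although there it means stopping distance) can stand in for the asker's xcars file:

```r
# cars is a built-in data set used here only as a stand-in for xcars.
first25 <- cars[1:25, ]      # first 25 rows, all columns
nrow(first25)                # 25

# Histogram of the dist column for those rows.
hist(first25$dist)

# Equivalent ways to get the same 25 values.
identical(first25$dist, head(cars$dist, 25))   # TRUE
```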

Code a new variable for a dataset using multiple logical operators in R [closed]

I am trying to convert SAS script to R to learn R. Below is the script in SAS.
if continent=1 and (country=5 or country=10) then rate = 8
Here are my attempts for the dataframe called data:
data$rate[(continent==1) & (country==5 | country==10)] <- 8
Or:
data$rate[(continent==1) & country %in% c(5,10)] <-8
Unfortunately, neither attempt produces the correct result. The result shows the rate 8 when either continent=1 or country=5 or country=10. I guess I am combining the logical operators incorrectly in R.
Could anyone help me fix the issue? Many thanks!
Note: I used attach(data) above since I am too lazy to type data$ each time.
It looks like the issue is that continent and country are not being taken from the data frame data. With attach(), those names refer to copies made when the data frame was attached, and any objects with the same names in your workspace take precedence over them, so the condition may be evaluated on something other than the current columns. To fix this, refer to the columns through the data frame explicitly:
data$rate[(data$continent==1 & (data$country==5 | data$country==10))] <- 8
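The same logic on a small made-up data frame, as a sketch only: `demo_df` and its values are hypothetical and stand in for the asker's data.

```r
# Hypothetical example data; rate starts out missing.
demo_df <- data.frame(
  continent = c(1, 1, 2, 1),
  country   = c(5, 7, 10, 10),
  rate      = NA_real_
)

# continent == 1 AND country is 5 or 10
hit <- demo_df$continent == 1 & demo_df$country %in% c(5, 10)
demo_df$rate[hit] <- 8

demo_df$rate   # 8 NA NA 8
```

Only rows 1 and 4 satisfy both conditions, so row 3 (country 10 on continent 2) is left untouched.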
