I tried to generate a line chart with 3 lines, one per year, but my code only produces 1 line. What should I do?
Try this:
Without seeing your data, this is guesswork on my part.
I suspect YEAR is being treated as a continuous variable and to get distinct colours you need YEAR to be a discrete variable.
ggplot(data = crimessum2) +
  geom_line(mapping = aes(x = HOUR, y = Numbers, col = factor(YEAR), group = YEAR)) +
  xlab("HOUR") +
  ylab("Total Paid by Insurance in $$") +
  ggtitle(" ")
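To see why the factor() call matters, here is a minimal sketch on made-up data. The `toy` data frame and its values are assumptions standing in for `crimessum2`, which was not shared. With a numeric YEAR and no group, ggplot2 should treat all rows as one path; with factor(YEAR) it draws one line per year.

```r
library(ggplot2)

# Hypothetical stand-in for crimessum2: 3 years x 4 hours
toy <- data.frame(
  YEAR    = rep(c(2015, 2016, 2017), each = 4),
  HOUR    = rep(c(0, 6, 12, 18), times = 3),
  Numbers = c(10, 4, 12, 20, 11, 5, 14, 22, 9, 3, 13, 25)
)

# Numeric YEAR: continuous colour scale, every point joined into a single path
p_single <- ggplot(toy, aes(x = HOUR, y = Numbers, colour = YEAR)) +
  geom_line()

# factor(YEAR): discrete colour scale, one line per year
p_multi <- ggplot(toy, aes(x = HOUR, y = Numbers, colour = factor(YEAR))) +
  geom_line()

# Count how many separate groups (lines) each plot actually draws
n_groups_single <- length(unique(layer_data(p_single)$group))
n_groups_multi  <- length(unique(layer_data(p_multi)$group))
```

Comparing `n_groups_single` with `n_groups_multi` is a quick way to check whether a grouping problem is behind a "single line" plot.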
I have a dataset with monthly averages of the interest rate and the corresponding years. I created a dummy variable that is 0 for the years up to 2016 and 1 for the years after. I want to plot the interest rate over time with a separate line for each value of the dummy variable (one for the years before 2016 and one for the years after). My code is:
p <- ggplot(data = dataset_new,
            mapping = aes(x = month(Dates, label = TRUE),
                          y = int_rate)) +
  geom_point() +
  geom_line(aes(group = factor(dummy),
                color = factor(dummy)))
p + theme(legend.background = element_rect(fill = "lightblue",
                                           size = 0.5, linetype = "solid"))
I would like to do two things next:
change the title of the legend from dummy to case study and
change the categories of the legend. What I mean is that now it writes 1 and 0, but I want to write (2017-2020) for the first and (2013-2016) for the second one.
Any help would be appreciated. Thanks in advance!
To change the legend title (the legend comes from the color aesthetic, so that is the label to set):
+ labs(color = 'Case Study')
To change the categories, I'd do it in the data:
dataset_new$case_study <- ifelse(dataset_new$dummy == 1, '(2017-2020)', '(2013-2016)')
And then in your ggplot call replace any instances of dummy with case_study.
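Put together, the two steps could look like the sketch below. The `toy` data frame, its `month` column, and its values are made up for illustration, since `dataset_new` was not shared; only `dummy`, `int_rate`, and `case_study` follow the names used above.

```r
library(ggplot2)

# Hypothetical stand-in for dataset_new
toy <- data.frame(
  month    = rep(1:3, times = 2),
  int_rate = c(2.1, 2.3, 2.2, 3.0, 3.1, 3.4),
  dummy    = rep(c(0, 1), each = 3)
)

# Recode the dummy into the labels that should appear in the legend
toy$case_study <- ifelse(toy$dummy == 1, "(2017-2020)", "(2013-2016)")

p <- ggplot(toy, aes(x = month, y = int_rate)) +
  geom_point() +
  geom_line(aes(group = case_study, colour = case_study)) +
  labs(colour = "Case Study")  # the title is set on the mapped aesthetic
```

Recoding in the data keeps the plot call simple; an alternative that leaves the data untouched would be to set the labels on the colour scale instead.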
I am trying to get a boxplot with 3 different tools in each dataset size like the one below:
ggplot(data1, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
But I need to transform x-axis to log scale. For that, I need to numericize each dataset to be able to transform them to log scale. Even without transforming them, they look like the one below:
ggplot(data2, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
I checked boxplot parameters and grouping parameters of aes, but could not resolve my problem. At first, I thought this problem is caused by scaling to log, but removing those elements did not resolve the problem.
What am I missing exactly? Thanks...
Files are in this link. "data2" is the numericized version of "data1".
Your question was a tough cookie, but I learned something new from it!
Just using group = dataset is not sufficient because you also have the tool variable to look out for. After digging around a bit, I found this post which made use of the interaction() function.
This is the trick that was missing. You want to use group because you are not using a factor for the x values, but you need to include tool in the separation of your data (hence using interaction() which will compute the possible crosses between the 2 variables).
# This is for pretty-printing the axis labels
my_labs <- function(x) {
  paste0(x / 1000, "k")
}
levs <- unique(data2$dataset)
ggplot(data2, aes(x = dataset, y = time, color = tool,
                  group = interaction(dataset, tool))) +
  geom_boxplot() + labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
  scale_x_log10(breaks = levs, labels = my_labs) + # define a log scale with your axis ticks
  scale_y_log10() + theme_bw()
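To make the role of interaction() concrete, here is a small base R sketch. The `dataset` and `tool` vectors are made-up values, not the contents of `data2`. Each distinct combination of the two variables becomes its own group, which is what lets geom_boxplot() draw one box per tool at each dataset size.

```r
# Hypothetical values standing in for two columns of data2
dataset <- c(1000, 1000, 10000, 10000)
tool    <- c("A", "B", "A", "B")

# One factor level per (dataset, tool) combination
groups <- interaction(dataset, tool)

n_groups     <- nlevels(groups)      # number of distinct boxes to draw
group_labels <- as.character(groups) # the combination each row belongs to
```

With `group = dataset` alone, the rows for tools A and B at the same size would fall into one group and be summarised in a single box.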
I have a time series in which each point has a time, a value, and a group it belongs to. I am trying to plot it with time on the x axis and value on the y axis, with the line changing color depending on the group.
I tried using geom_path and geom_line, but they end up linking points to points within groups. I found out that when I use a continuous variable for the groups, I have a normal line; however when I use a factor or a categorical variable, I have the link problem.
Here is a reproducible example that is what I would like:
df = data.frame(time = c(1,2,3,4,5,6,7,8,9,10), value = c(5,4,9,3,8,2,5,8,7,1), group = c(1,2,2,2,1,1,2,2,2,2))
ggplot(df, aes(time, value, color = group)) + geom_line()
And here is a reproducible example that is what I have:
df = data.frame(time = c(1,2,3,4,5,6,7,8,9,10), value = c(5,4,9,3,8,2,5,8,7,1), group = c("apple","pear","pear","pear","apple","apple","pear","pear","pear","pear"))
ggplot(df, aes(time, value, color = group)) + geom_line()
So the first example works well, but 1/ it adds a few lines to change the legend to have the labels I want, 2/ out of curiosity I would like to know if I missed something.
Is there any option in ggplot I could use to have the behavior I expect, or is it an internal constraint?
As pointed out by Richard Telford and Carles Sans Fuentes, adding group = 1 to the ggplot aesthetics does the job. So the working code is:
ggplot(df, aes(time, value, color = group, group = 1)) + geom_line()
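The sketch below reuses the data from the question to show the difference in grouping. The names `p_default`, `p_single`, `n_default`, and `n_single` are only for illustration. By default, a discrete variable mapped to colour also defines the groups, so geom_line() draws a separate line per fruit; group = 1 overrides that and keeps a single line that changes colour along the way.

```r
library(ggplot2)

df <- data.frame(
  time  = 1:10,
  value = c(5, 4, 9, 3, 8, 2, 5, 8, 7, 1),
  group = c("apple", "pear", "pear", "pear", "apple",
            "apple", "pear", "pear", "pear", "pear")
)

# Default grouping: the discrete colour variable splits the data into two lines
p_default <- ggplot(df, aes(time, value, colour = group)) + geom_line()

# group = 1: all points belong to one line, coloured segment by segment
p_single <- ggplot(df, aes(time, value, colour = group, group = 1)) + geom_line()

# Number of separate lines each plot draws
n_default <- length(unique(layer_data(p_default)$group))
n_single  <- length(unique(layer_data(p_single)$group))
```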
I have a Date column and a Value column. I searched the internet and tried everything I could find, but I cannot get the trend line graph to show. I am totally confused about what is happening with my data. My code is below:
ggplot(data = New, aes(x = OrderDate, y = TotalAmountWithGST))+
geom_line(color = "#00AFBB", size = 2) + scale_x_date(date_labels = "%b/%Y")
ggplot(x, aes(x = OrderDate, y = TotalAmountWithGST)) +
geom_line()+
theme_minimal()
I am trying to plot a line graph that shows a monthly trend, but somehow I get something that looks like a bar graph rather than a line graph.
You need to add a geom_smooth to your ggplot code.
It's hard to replicate a working example without sample data but that should get you on the right path.
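As a rough sketch of that suggestion, the example below uses simulated data, since the real `New` data frame was not shared. The `orders` data frame and its values are assumptions; only the column names follow the question. The choice of a loess smoother is also an assumption, as the suggestion above does not name a method.

```r
library(ggplot2)

# Simulated daily orders: an upward trend plus noise (made-up values)
set.seed(1)
orders <- data.frame(
  OrderDate          = seq(as.Date("2020-01-01"), by = "day", length.out = 120),
  TotalAmountWithGST = 100 + 0.5 * seq_len(120) + rnorm(120, sd = 20)
)

p <- ggplot(orders, aes(x = OrderDate, y = TotalAmountWithGST)) +
  geom_line(colour = "grey60") +                               # raw, noisy series
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) + # smoothed trend
  scale_x_date(date_labels = "%b/%Y")
```

If the real data has several rows per date, the raw line will zigzag vertically within each date, which may explain the bar-like appearance described in the question.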
Provided the following dataframe (see below) which was taken out of a questionnaire asking about perceived security to people from different neighborhoods, I have managed to create a bar plot which displays perceived security and groups results per each neighborhood:
questionnaire_raw = read.csv("https://www.dropbox.com/s/l647q2omffnwyrg/local.data.csv?dl=0")
ggplot(data = questionnaire_raw,
aes(x = factor(Seguridad.de.tu.barrio..de.día.), # We have to convert x values to categorical data
y = (..count..)/sum(..count..)*100,
fill = neighborhoods)) +
geom_bar(position="dodge") +
ggtitle("Seguridad de día") +
labs(x="Grado de seguridad", y="% encuestados", fill="Barrios")
I would like to overlay these results with a line graph representing the mean of each security category (1, 2, 3 or 4) in all neighborhoods (this is, without grouping results), so it is easy to know if a specific neighborhood is over or under the average of all neighborhoods. However, since it's my first job with R, I do not know how to calculate that mean with a dataframe and then overlay it in the previous barplot.
Using data.table for the data manipulation, and following lukeA's comment:
library(ggplot2)
library(data.table)
setDT(questionnaire_raw)
setnames(questionnaire_raw, c("Timestamp", "Barrios", "Grado"))
# Count responses per neighborhood (Barrios) and security grade (Grado)
plot_data <- questionnaire_raw[, .N, by = .(Barrios, Grado)]
ggplot(plot_data, aes(x = factor(Grado), y = N, fill = Barrios)) +
  geom_bar(position = "dodge", stat = "identity") +
  # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
  ggtitle("Seguridad de día") +
  labs(x = "Grado de seguridad", y = "% encuestados", fill = "Barrios")
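For reference, the overlaid line is the mean of N across neighborhoods at each grade. The base R sketch below reproduces that calculation on made-up counts; the `plot_data` values here are hypothetical, not the questionnaire results.

```r
# Hypothetical counts per neighborhood (Barrios) and security grade (Grado)
plot_data <- data.frame(
  Barrios = rep(c("A", "B"), each = 2),
  Grado   = rep(c(1, 2), times = 2),
  N       = c(4, 6, 8, 2)
)

# Mean count per grade across all neighborhoods: the values the line passes through
mean_by_grade <- aggregate(N ~ Grado, data = plot_data, FUN = mean)
```

Computing the means explicitly like this makes it easy to check the line against the bars, or to pass the result to a separate geom_line() layer.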