How to make multiple highchart graphs in R?

I'm trying to graph multiple data frame columns in R (as in the question "Graphing multiple variables in R").
bid ask date
1 20.12 20.14 2014-10-31
2 20.09 20.12 2014-11-03
3 20.03 20.06 2014-11-04
4 19.86 19.89 2014-11-05
This is my data.
I can make a single line graph like this:
`data %>% select(bid, ask, date) %>% hchart(type = 'line', hcaes(x = 'date', y = 'bid'))`
I want to add the ask line to the same graph.

One way is to reshape (gather) the values into long format and then add a group aesthetic to the hchart call:
library(dplyr)
library(tidyr)
library(highcharter)
data %>% select(bid, ask, date) %>%
  gather("key", "value", bid, ask) %>%
  hchart(type = 'line', hcaes(x = 'date', y = 'value', group = 'key'))
P.S. Don't forget to load all the necessary libraries (here dplyr, tidyr and highcharter).
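
The tidyr documentation marks gather() as superseded by pivot_longer(), so the same reshape can also be sketched with the newer function. This is only an illustration: the data frame is rebuilt from the four rows shown in the question, the dates are stored as Date values (an assumption, since the question does not say how they are stored), and the column names key and value are carried over from the answer above.

```r
library(tidyr)
library(highcharter)

# Data rebuilt from the four rows shown in the question.
data <- data.frame(
  bid  = c(20.12, 20.09, 20.03, 19.86),
  ask  = c(20.14, 20.12, 20.06, 19.89),
  date = as.Date(c("2014-10-31", "2014-11-03", "2014-11-04", "2014-11-05"))
)

# Wide -> long: one row per (date, series) pair.
long <- pivot_longer(data, cols = c(bid, ask),
                     names_to = "key", values_to = "value")

# One line per level of key (bid and ask).
# Print hc at the console to render the chart.
hc <- hchart(long, "line", hcaes(x = date, y = value, group = key))
```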

You can use the following code:
library(reshape2)
library(highcharter)
df_m <- melt(df, id.vars = "date")
hchart(df_m, "line", hcaes(x = date, y = value, group = variable))
Here is the data:
df = structure(list(bid = c(20.12, 20.09, 20.03, 19.86), ask = c(20.14,
20.12, 20.06, 19.89), date = structure(c(4L, 1L, 2L, 3L), .Label = c("03/11/2014",
"04/11/2014", "05/11/2014", "31/10/2014"), class = "factor")), class = "data.frame", row.names = c(NA,
-4L))
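
One detail of this data is worth a hedged note: the date column is a factor whose labels are in dd/mm/yyyy form, and factor levels sort as text rather than chronologically. Whether that affects the order of points on the chart is not stated in the answer, so the following base R sketch simply shows how the labels could be parsed into real dates before melting. The data frame is rebuilt here with factor() instead of structure().

```r
# Same values as the answer's data; the dates are a factor in dd/mm/yyyy form.
df <- data.frame(
  bid  = c(20.12, 20.09, 20.03, 19.86),
  ask  = c(20.14, 20.12, 20.06, 19.89),
  date = factor(c("31/10/2014", "03/11/2014", "04/11/2014", "05/11/2014"))
)

# Factor levels sort as text, so "31/10/2014" is the last level.
levels(df$date)

# Parse the labels as real dates and put the rows in chronological order.
df$date <- as.Date(as.character(df$date), format = "%d/%m/%Y")
df <- df[order(df$date), ]
```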

Related

Making a Stacked Bar Chart Out of Table Columns in R

I'm trying to create a stacked bar graph showing body composition. I have a table/data set (I don't know the correct term) that looks like this:
structure(list(data.Date = structure(1:7, .Label = c("2021-03-06",
"2021-03-07", "2021-03-08", "2021-03-09", "2021-03-10", "2021-03-11",
"2021-03-12"), class = "factor"), total_bf = c(19.6612, 18.2182,
19.6803, 21.7047, 18.126, 19.7, 19.1424), total_muscle = c(41.5948,
43.043, 42.1578, 42.1866, 43.4017, 42.2, 42.2728), other = c(37.544,
38.8388, 38.0619, 38.0087, 39.1723, 38.1, 38.2848)), class = "data.frame", row.names = c(NA,
-7L))
Each column is a weight in kilograms. Together they add up to the total body weight of the subject. What I want is a stacked bar graph where each bar represents a date and each bar is split by total_bf, total_muscle and other. All of the guides and Q&As I've seen don't seem to apply to my situation. Maybe this is because I am new but nothing I've tried has worked yet.
An example of what I'm trying to achieve:
The only difference is that on my graph blue would be body fat (total_bf), green would be other and red would be muscle (total_muscle).
You can convert the data from wide format to long format using the tidyr::pivot_longer() function:
library(ggplot2)
df <- structure(list(
data.Date = structure(
1:7,
.Label = c("2021-03-06", "2021-03-07", "2021-03-08", "2021-03-09",
"2021-03-10", "2021-03-11", "2021-03-12"), class = "factor"),
total_bf = c(19.6612, 18.2182, 19.6803, 21.7047, 18.126, 19.7, 19.1424),
total_muscle = c(41.5948, 43.043, 42.1578, 42.1866, 43.4017, 42.2, 42.2728),
other = c(37.544, 38.8388, 38.0619, 38.0087, 39.1723, 38.1, 38.2848)
), class = "data.frame", row.names = c(NA, -7L))
long <- tidyr::pivot_longer(df, -data.Date)
Then, in ggplot2, geom_col() stacks the bars by default, so you only need to specify the x, y and fill aesthetics.
ggplot(long, aes(data.Date, value, fill = name)) +
  geom_col()
Since your date is encoded as a factor, if you want to encode it as a real date you can convert it as follows:
long$date <- as.Date(strptime(as.character(long$data.Date), format = "%Y-%m-%d"))
ggplot(long, aes(date, value, fill = name)) +
geom_col()
Created on 2021-03-12 by the reprex package (v0.3.0)
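
The question also asks for specific colours (blue for body fat, green for other, red for muscle). A minimal sketch of one way to get them is to add scale_fill_manual() with a named colour vector; the variable name fill_colours is made up for this example, and the data frame is rebuilt from the values in the question.

```r
library(ggplot2)
library(tidyr)

# Data rebuilt from the question, with the dates as plain character strings.
df <- data.frame(
  data.Date    = c("2021-03-06", "2021-03-07", "2021-03-08", "2021-03-09",
                   "2021-03-10", "2021-03-11", "2021-03-12"),
  total_bf     = c(19.6612, 18.2182, 19.6803, 21.7047, 18.126, 19.7, 19.1424),
  total_muscle = c(41.5948, 43.043, 42.1578, 42.1866, 43.4017, 42.2, 42.2728),
  other        = c(37.544, 38.8388, 38.0619, 38.0087, 39.1723, 38.1, 38.2848)
)

long <- pivot_longer(df, -data.Date)
long$date <- as.Date(long$data.Date)

# Colours requested in the question, keyed by the values in long$name.
fill_colours <- c(total_bf = "blue", other = "green", total_muscle = "red")

p <- ggplot(long, aes(date, value, fill = name)) +
  geom_col() +
  scale_fill_manual(values = fill_colours)
```

Printing p draws the chart. Because the colour vector is named, the mapping does not depend on the order of the levels.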

Bar graph for binary variable across two groups?

I have the following data (sample below):
Participant Group Choice
1 Control 0
2 Control 0
3 Control 0
4 Stress 1
5 Stress 1
6 Stress 1
I want to create a bar graph depicting the frequencies of Choice (0 or 1) for Group (Stress VS Control).
Make a table and use barplot, which comes with base R.
barplot(with(dat, table(Choice, Group)), main = "My plot", beside = TRUE, col = 2:3)
Data:
(Forgive me that I chose slightly more interesting data :)
dat <- structure(list(Participant = 1:6, Group = c("Control", "Control",
"Control", "Stress", "Stress", "Stress"), Choice = c(0L, 1L,
0L, 0L, 1L, 1L)), class = "data.frame", row.names = c(NA, -6L
))
You can use count() to compute the frequencies, convert the variables to factors, and plot.
library(dplyr)
library(ggplot2)
df %>%
count(Group, Choice) %>%
mutate(Choice = factor(Choice), Group = factor(Group)) %>%
ggplot() + aes(Group, n, fill = Choice) + geom_col()
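
As a sketch of what the count step produces, the snippet below runs it on the six-row dat sample from the first answer and then draws side-by-side bars with position = "dodge", which mirrors beside = TRUE in the barplot version. The intermediate name counts is introduced only for this example.

```r
library(dplyr)
library(ggplot2)

# Sample data from the first answer.
dat <- data.frame(
  Participant = 1:6,
  Group  = rep(c("Control", "Stress"), each = 3),
  Choice = c(0L, 1L, 0L, 0L, 1L, 1L)
)

# One row per (Group, Choice) combination, with its frequency in column n.
counts <- dat %>%
  count(Group, Choice) %>%
  mutate(Choice = factor(Choice))

# Dodged bars: one bar per Choice value within each Group.
p <- ggplot(counts, aes(Group, n, fill = Choice)) +
  geom_col(position = "dodge")
```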

Make two geom_bar() plots based on different columns in one plot

I have a data frame that looks like this:
Year Women Men
1 2013 145169 889190
2 2014 119064 849778
3 2015 210107 1079592
4 2016 221217 1427639
5 2017 205000 1692592
6 2018 273721 1703456
7 2019 434407 2010493
I want to make a geom_bar plot where x is the year and every year has two bars, one for the Women count and one for the Men count. I have found a solution that requires the table to look different, but I'm wondering if there is a way to work with this one. Thank you for any help :)
You can use the following code:
library(tidyverse)
df %>%
pivot_longer(cols = -c(Year,Sl), values_to = "Value", names_to = "Name") %>%
ggplot(aes(x = Year, y = Value, fill = Name))+geom_col(position = "dodge")
Data
df = structure(list(Sl = 1:7, Year = 2013:2019, Women = c(145169L,
119064L, 210107L, 221217L, 205000L, 273721L, 434407L), Men = c(889190L,
849778L, 1079592L, 1427639L, 1692592L, 1703456L, 2010493L)), class = "data.frame", row.names = c(NA,
-7L))
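
Note that the data in this answer carries an extra Sl column that the question's data frame does not have. A hedged variant that names the value columns explicitly, and so should work on the frame exactly as shown in the question, might look like this (wrapping Year in factor() to get one tick per year is a choice made for this sketch):

```r
library(tidyr)
library(ggplot2)

# Data as shown in the question (no Sl column).
df <- data.frame(
  Year  = 2013:2019,
  Women = c(145169L, 119064L, 210107L, 221217L, 205000L, 273721L, 434407L),
  Men   = c(889190L, 849778L, 1079592L, 1427639L, 1692592L, 1703456L, 2010493L)
)

# Wide -> long: one row per (Year, Name) pair.
long <- pivot_longer(df, cols = c(Women, Men),
                     names_to = "Name", values_to = "Value")

# Two dodged bars per year, one for each Name.
p <- ggplot(long, aes(x = factor(Year), y = Value, fill = Name)) +
  geom_col(position = "dodge") +
  labs(x = "Year")
```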

Multi-line Time Series Chart in ggplot2

I have a data frame with two columns, 'host' and 'date', which describes a series of cyber attacks against a number of different servers on specific dates over a seven-month period.
Here's what the data looks like,
> china_atks %>% head(100)
host date
1 groucho-oregon 2013-03-03
2 groucho-oregon 2013-03-03
...
46 groucho-singapore 2013-03-03
48 groucho-singapore 2013-03-04
...
Where 'groucho-oregon', 'groucho-singapore', etc., is the hostname of the server targeted by an attack.
There are around 190,000 records, spanning 03/03/2013 to 08/09/2013, e.g.
> unique(china_atks$date)
[1] "2013-03-03" "2013-03-04" "2013-03-05" "2013-03-06" "2013-03-07"
"2013-03-08" "2013-03-09"
[8] "2013-03-10" "2013-03-11" "2013-03-12" "2013-03-13" "2013-03-14"
"2013-03-15" "2013-03-16"
[15] "2013-03-17" "2013-03-18" "2013-03-19" "2013-03-20" "2013-03-21"
"2013-03-22" "2013-03-23"
...
I'd like to create a multi-line time series chart that visualises how many attacks each individual server received each day over the range of dates, but I can't figure out how to pass the data to ggplot to achieve this. There are nine unique hostnames, and so the chart would show nine lines.
Thanks!
Here's one way to do this.
First, summarize the count frequency by host and date.
library(plyr)
library(ggplot2)
df <- plyr::count(da, c("host", "date"))
Then do the plotting, grouping by host so that each server gets its own line.
ggplot(data = df, aes(x = date, y = freq, group = host)) +
  geom_line(aes(color = host))
Data
da <- structure(list(host = structure(1:4, .Label = c("groucho-eu",
"groucho-oregon", "groucho-singapore", "groucho-tokyo"), class = "factor"),
date = structure(c(1L, 1L, 1L, 1L), .Label = "2013-03-03", class = "factor"),
freq = c(1L, 4L, 2L, 1L)), .Names = c("host", "date", "freq"
), row.names = c(NA, -4L), class = "data.frame")
ggplot2 can compute summary statistics itself, so another option is to let ggplot handle the counting. Applied to the raw, unsummarized data frame, this should draw one line per host:
ggplot(df, aes(x = date, colour = host, group = host)) + geom_line(stat = "count")
Note: Make sure host is converted to a factor so that the lines get discrete colours.
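
To make the count-then-plot idea concrete, here is a small sketch using dplyr::count(). The rows in toy are made up for illustration and are not taken from china_atks; only the two column names and the hostnames come from the question, and the column name attacks is an assumption of this example.

```r
library(dplyr)
library(ggplot2)

# Made-up toy rows in the same shape as china_atks (not the real data).
toy <- data.frame(
  host = c("groucho-oregon", "groucho-oregon", "groucho-oregon",
           "groucho-singapore", "groucho-singapore", "groucho-singapore"),
  date = as.Date(c("2013-03-03", "2013-03-03", "2013-03-04",
                   "2013-03-03", "2013-03-04", "2013-03-04"))
)

# One row per host and day, with the number of attacks in column attacks.
daily <- count(toy, host, date, name = "attacks")

# group = host gives each server its own line.
p <- ggplot(daily, aes(x = date, y = attacks, colour = host, group = host)) +
  geom_line()
```

Storing date as a Date rather than a factor gives a continuous time axis, which is usually what a time series chart needs.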

R: plot multiple curves vs one var but for 4 factors

I have a DF that looks like:
id app vac dac
1: 1 1000802 579 455
2: 1 1000803 1284 918
3: 1 1000807 68 66
4: 1 1000809 1470 903
5: 2 1000802 407 188
6: 2 1000803 365 364
7: 2 1000807 938 116
8: 2 1000809 699 570
I need to plot vac and dac for each app on the same canvas as a function of id. I know how to do it for a single app by using melt and plotting with ggplot, but I'm stuck on how to do it for an arbitrary number of factors/levels.
In this example there will be a total of 8 curves for the 4 apps. Any thoughts?
Here's the data frame for tests. Thank you!
df = structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), app = c(1000802,
1000803, 1000807, 1000809, 1000802, 1000803, 1000807, 1000809
), vac = c(579, 1284, 68, 1470, 407, 365, 938, 699), dac = c(455,
918, 66, 903, 188, 364, 116, 570)), .Names = c("id", "app", "vac",
"dac"), class = c("data.table", "data.frame"), row.names = c(NA,
-8L))
Edit: some clarification on axes,
x axis = id, y axis = values of vac and dac for each of 4 app factors
It is a bit unclear what you are looking for, but if you are looking for a line connecting the values of vac and dac, here is a solution using dplyr and tidyr.
First, gather the vac and dac columns (this is similar to reshape2::melt but with a syntax I find easier to follow). Then, set the variable (which has "vac" and "dac") as your x-locations, the value (from the old vac and dac columns) as your y and then map app and id to aesthetics (here, color and linetype). Set the group to ensure that it connects the right pairs of points, and add geom_line:
library(dplyr)
library(tidyr)
library(ggplot2)
df %>%
  gather(variable, value, vac, dac) %>%
  ggplot(aes(x = variable
             , y = value
             , color = factor(app)
             , linetype = factor(id)
             , group = paste(app, id))) +
  geom_line()
Given the question edit, you can change axes like so:
df %>%
gather(variable, value, vac, dac) %>%
ggplot(aes(x = id
, y = value
, color = factor(app)
, linetype = variable
, group = paste(app, variable))) +
geom_line()
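
The same reshape for the second plot (id on the x axis) can be sketched with pivot_longer(), using interaction() instead of paste() to build the grouping. This is an illustrative variant under the assumption that a plain data.frame is acceptable in place of the data.table in the question; it checks that the grouping yields the 8 curves the question expects.

```r
library(tidyr)
library(ggplot2)

# Data from the question, rebuilt as a plain data.frame.
df <- data.frame(
  id  = rep(1:2, each = 4),
  app = rep(c(1000802, 1000803, 1000807, 1000809), times = 2),
  vac = c(579, 1284, 68, 1470, 407, 365, 938, 699),
  dac = c(455, 918, 66, 903, 188, 364, 116, 570)
)

# Wide -> long: one row per (id, app, variable) combination.
long <- pivot_longer(df, cols = c(vac, dac),
                     names_to = "variable", values_to = "value")

# Each (app, variable) pair is one curve: 4 apps x 2 variables = 8 lines.
n_curves <- nlevels(interaction(long$app, long$variable, drop = TRUE))

p <- ggplot(long, aes(x = id, y = value,
                      colour = factor(app),
                      linetype = variable,
                      group = interaction(app, variable))) +
  geom_line()
```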
I'm not sure I understood your question, but I would do something like
ggplot(df,aes(vac,app,group=app)) + geom_point(aes(color=factor(app)))
