BeautifulSoup getting attribute value not working - web-scraping

I have code to extract an attribute value, but the list comprehension does not get all of them; it only works when I index into one element specifically. I am confused as to what is incorrect.
from bs4 import BeautifulSoup
import requests
url = 'https://seekingalpha.com/market-news'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
news = soup.find(name="ul", attrs={'class':"item-list",'id':'latest-news-list'})
root = 'https://seekingalpha.com'
#id_list = [i['id'] for i in news.find_all(name='li', attrs={'class':'item'})]
heading_list = [i.text for i in news.find_all('a')]
#url_list = [root+i['href'] for i in news.find_all('a')]
datehtml = news.find_all(name='li', attrs={'class':'item'})
date_list = [news.find_all(name='li', attrs={'class':'item'})[i] for i in range(len(datehtml))]
#date_list2 = [i.split()[0] for i in date_list]
#id_list2 = [i.split('-')[2] for i in id_list]
date_list[1]['data-last-date'] # works
[i['data-last-date'] for i in date_list] #error

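Your code is a bit messy, for starters. The error itself comes from the fact that not every li tag with class item has a data-last-date attribute: i['data-last-date'] raises a KeyError on the first tag that lacks it, whereas .get() returns None instead. So why not just grab all li tags with class item and extract the attribute value with .get()?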
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://seekingalpha.com/market-news'
soup = BeautifulSoup(requests.get(url).text, 'html.parser').find_all("li", {"class": "item"})
print([l.get("data-last-date") for l in soup])
Output:
['2020-11-10 11:43:36 -0500', '2020-11-10 11:38:12 -0500', None, '2020-11-10 11:37:37 -0500', '2020-11-10 11:35:56 -0500', '2020-11-10 11:34:53 -0500', '2020-11-10 11:16:19 -0500', '2020-11-10 11:13:32 -0500', '2020-11-10 11:12:55 -0500', '2020-11-10 11:04:01 -0500', '2020-11-10 11:00:25 -0500', '2020-11-10 10:52:45 -0500', '2020-11-10 10:51:07 -0500', '2020-11-10 10:49:43 -0500', '2020-11-10 10:45:27 -0500', '2020-11-10 10:45:01 -0500', '2020-11-10 10:40:28 -0500', '2020-11-10 10:35:38 -0500', '2020-11-10 10:29:40 -0500', '2020-11-10 10:28:35 -0500', '2020-11-10 10:25:52 -0500', '2020-11-10 10:25:31 -0500', '2020-11-10 10:22:57 -0500', '2020-11-10 10:20:58 -0500', '2020-11-10 10:18:00 -0500', '2020-11-10 10:15:18 -0500', '2020-11-10 10:11:18 -0500', '2020-11-10 10:04:29 -0500', '2020-11-10 10:03:10 -0500', '2020-11-10 10:02:43 -0500', '2020-11-10 10:01:56 -0500', '2020-11-10 10:01:42 -0500', '2020-11-10 10:00:48 -0500', '2020-11-10 09:58:09 -0500', '2020-11-10 09:57:17 -0500', '2020-11-10 09:56:11 -0500', '2020-11-10 09:55:39 -0500', '2020-11-10 09:54:42 -0500', '2020-11-10 09:50:00 -0500', '2020-11-10 09:47:51 -0500', '2020-11-10 09:47:29 -0500', '2020-11-10 09:47:13 -0500', '2020-11-10 09:46:59 -0500', '2020-11-10 09:46:12 -0500', '2020-11-10 09:40:10 -0500', '2020-11-10 09:37:15 -0500', '2020-11-10 09:37:00 -0500', '2020-11-10 09:35:46 -0500', '2020-11-10 09:35:46 -0500', '2020-11-10 09:34:25 -0500', '2020-11-10 09:31:43 -0500', '2020-11-10 09:29:51 -0500', '2020-11-10 09:27:39 -0500', '2020-11-10 09:27:24 -0500', '2020-11-10 09:21:19 -0500', '2020-11-10 09:19:07 -0500', '2020-11-10 09:18:40 -0500', '2020-11-10 09:17:57 -0500', '2020-11-10 09:17:53 -0500', '2020-11-10 09:16:24 -0500', '2020-11-10 09:15:46 -0500']
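The None in the output above marks an li tag without the attribute. As a minimal sketch of the difference between subscript access and .get(), the snippet below runs against a small inline HTML string instead of the live page. The HTML, the link targets, and the variable names are made up for illustration; only the tag, class, and attribute names come from the question. It also shows one way to skip tags that lack the attribute and to parse the remaining strings into datetime objects.

```python
from datetime import datetime
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: the second <li> has no data-last-date.
HTML = """
<ul class="item-list" id="latest-news-list">
  <li class="item" data-last-date="2020-11-10 11:43:36 -0500"><a href="/news/1">First</a></li>
  <li class="item"><a href="/news/2">Second</a></li>
  <li class="item" data-last-date="2020-11-10 11:37:37 -0500"><a href="/news/3">Third</a></li>
</ul>
"""

soup = BeautifulSoup(HTML, "html.parser")
items = soup.find_all("li", attrs={"class": "item"})

# Subscript access raises KeyError as soon as one tag lacks the attribute.
try:
    strict = [li["data-last-date"] for li in items]
except KeyError:
    strict = None

# .get() returns None for a missing attribute instead of raising.
lenient = [li.get("data-last-date") for li in items]

# Passing True as the attribute value keeps only tags that have the attribute.
dated_items = soup.find_all("li", attrs={"class": "item", "data-last-date": True})
raw_dates = [li["data-last-date"] for li in dated_items]

# Parse the strings into timezone-aware datetime objects.
parsed = [datetime.strptime(d, "%Y-%m-%d %H:%M:%S %z") for d in raw_dates]

print(strict)
print(lenient)
print(parsed[0].isoformat())
```

Filtering on the attribute up front avoids having to strip None values afterwards, which matters if the dates are going to be split or parsed later, as the commented-out lines in the question suggest.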

Related

Drm function for dose response curve

I am trying to make a dose-response curve (i.e., a titration curve). The data is
structure(list(Dilution = c(300L, 900L, 2700L, 8100L, 24300L,
72900L, 218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L,
300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L, 300L, 900L,
2700L, 8100L, 24300L, 72900L, 218700L, 300L, 900L, 2700L, 8100L,
24300L, 72900L, 218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L,
218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L),
X..bound = c(92.43, 92.95, 92.26, 86.55, 67.49, 21.86, 0.72,
89.57, 87.84, 82.35, 65.84, 24.18, 3.56, 0.32, 91.63, 90.57,
87.22, 77.03, 39.52, 5.39, 1.24, 93.51, 93.56, 90.33, 80.49,
38.97, 4.7, 0.93, 95.37, 94.44, 91.24, 77.74, 28.76, 2.14,
0.15, 0.01, 0, 0, 0, 0, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0),
Sample = c("CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA",
"CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA",
"CoV77-39 1mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA",
"CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA",
"CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG",
"CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG",
"CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 HA",
"CoV77-39 HA", "CoV77-39 HA", "CoV77-39 HA", "CoV77-39 HA",
"CoV77-39 HA", "CoV77-39 HA", "CoV77-39 WT", "CoV77-39 WT",
"CoV77-39 WT", "CoV77-39 WT", "CoV77-39 WT", "CoV77-39 WT",
"CoV77-39 WT")), class = "data.frame", row.names = c(NA,
-49L))
I then run try<-drm(X..bound~Dilution,data=Titration.8.31,Sample,robust="mean",fct=LL.4()) and generate a curve by running
plot(try,col=c("dodgerblue2", "#E31A1C", "green4", "#6A3D9A", "#FF7F00", "black", "gold1", "skyblue2", "palegreen2", "#FDBF6F", "gray70", "maroon", "orchid1", "darkturquoise", "darkorange4", "brown"),lty=c(1,1,1,1,1,5,5))
which looks great because the lines of best fit don't go below 0. However, I am not sure how to change the numerical axis, more specifically the x-axis. I am trying to label the x-axis as 10^.x. I know how to do this in ggplot by using
+scale_x_continuous(trans = "log10",breaks = trans_breaks("log10", function(x) 10^x),labels = trans_format("log10", math_format(10^.x)), minor_breaks = 10^(seq(0, 7, by = 0.25)))
but this won't work with the plot function. Is there a way to label my x-axis as 10^x instead of the odd labeling/spacing it has now in the attached image? Or is there a way to do this in ggplot (preferred)?

.drc plot and ggplot function

I am trying to plot a graph with ggplot. Currently, I can only plot my .drc results with the plot function in R, not with ggplot. I want to use ggplot since I already have working code for it and ggplot is more customizable than the plot function. My line of code for the .drc fit is:
try<-drm(X..bound~Dilution,data=Titration.8.31,Sample,robust="mean",fct=LL.4())
which generates my .drc data. I can then plot this using the plot function, which I don't really want to do since I can't easily change labels or anything else. In ggplot, my existing code uses loess lines of best fit, which I want to remove (since they drop below zero) and replace with my .drc fit:
ggplot(Titration.8.31, aes(x = Dilution, y = `X..bound`)) +
geom_point(size=5,aes(color=Sample,shape=Sample)) +
scale_shape_manual(values=c(0,2,5,8,13,15,16,17,18,19,20,10,9,3)) +
scale_x_continuous(trans = "log10",breaks = trans_breaks("log10", function(x) 10^x),labels = trans_format("log10", math_format(10^.x)), minor_breaks = 10^(seq(0, 7, by = 0.25))) +
labs(x="Antibody Dilution",y="% Cell Binding") +
theme_minimal() +
theme(axis.title.x=element_text(size=22)) +
theme(axis.title.y=element_text(size=22)) +
theme(axis.text=element_text(size=18)) +
scale_color_manual(values=c("dodgerblue2", "#E31A1C", "green4",
"#6A3D9A", "#FF7F00", "black", "gold1", "skyblue2", "palegreen2", "#FDBF6F", "gray70", "maroon", "orchid1", "darkturquoise", "darkorange4", "brown")) +
coord_cartesian(ylim=c(0,100)) +
theme(legend.key.size =unit(1,"in")) +
theme(legend.text=element_text(size=11))
How do I change this code so that my .drc curves become the new lines of best fit? If I can't use ggplot, how do I change the x-axis label in the plot function (which I think might be easier)?
The data is dput(Titration.8.31):
structure(list(Dilution = c(300L, 900L, 2700L, 8100L, 24300L,
72900L, 218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L,
300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L, 300L, 900L,
2700L, 8100L, 24300L, 72900L, 218700L, 300L, 900L, 2700L, 8100L,
24300L, 72900L, 218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L,
218700L, 300L, 900L, 2700L, 8100L, 24300L, 72900L, 218700L),
X..bound = c(92.43, 92.95, 92.26, 86.55, 67.49, 21.86, 0.72,
89.57, 87.84, 82.35, 65.84, 24.18, 3.56, 0.32, 91.63, 90.57,
87.22, 77.03, 39.52, 5.39, 1.24, 93.51, 93.56, 90.33, 80.49,
38.97, 4.7, 0.93, 95.37, 94.44, 91.24, 77.74, 28.76, 2.14,
0.15, 0.01, 0, 0, 0, 0, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0),
Sample = c("CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA",
"CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA", "CoV77-39 1mer 0DA",
"CoV77-39 1mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA",
"CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA",
"CoV77-39 5mer 0DA", "CoV77-39 5mer 0DA", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG", "CoV77-39 5mer 2DA GGG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDGDG",
"CoV77-39 5mer 2DA GDGDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG",
"CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG",
"CoV77-39 5mer 2DA GDG", "CoV77-39 5mer 2DA GDG", "CoV77-39 HA",
"CoV77-39 HA", "CoV77-39 HA", "CoV77-39 HA", "CoV77-39 HA",
"CoV77-39 HA", "CoV77-39 HA", "CoV77-39 WT", "CoV77-39 WT",
"CoV77-39 WT", "CoV77-39 WT", "CoV77-39 WT", "CoV77-39 WT",
"CoV77-39 WT")), class = "data.frame", row.names = c(NA,
-49L))
Any help is appreciated and very welcome, as I am very new to coding :) Thank you in advance for your time! I really am stuck.

How to color based on the last value on a time series line graph?

I have the following data (mydata):
structure(list(Provinsi = c("ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH"), Persenproposi = c(14.8500365764448, 15.5075939248601, 16.6821994408201, 20.0239808153477, 21.0322580645161, 22.1628838451268)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
I am plotting Proporsi (y-axis) against Date (x-axis) for each province (using facet_wrap and geom_line).
mydata_ends <- mydata %>%
group_by(Provinsi) %>%
top_n(1, Date)
ggplot(data=mydata, aes(x=Date,y=Proporsi, colors (if(Proporsi>=0.4){"RED"}, else if (Proporsi<=0.2){"GREEN"}), else {"Yellow"}) +
xlab("Tanggal")+
ylab("Proporsi Keterisian TT/Kasus Aktif")+
scale_y_continuous(labels = scales::percent_format(accurracy=1))+
geom_line() +
geom_text_repel(aes(label = Proporsi), data = mydata_ends, size = 3)+
facet_wrap(~Provinsi, ncol=5) +
ggtitle("Proporsi KeterisianTT/Kasus Aktif") +
theme(plot.title = element_text(family="Trebuchet MS", face="bold", size=20, hjust=0, color="#555555")) +
theme(axis.text.x = element_text(angle=90))
I have been trying with ggrepel and other conditional coloring functions, but they didn't work.
Any help is appreciated, many thanks!
structure(list(Provinsi = c("ACEH", "ACEH", "ACEH", "ACEH", "ACEH",
"ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH",
"ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH",
"ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "ACEH", "BALI", "BALI",
"BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI",
"BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI",
"BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI", "BALI",
"BALI", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN",
"BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN",
"BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN",
"BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN", "BANTEN",
"BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU",
"BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU",
"BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU",
"BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU", "BENGKULU",
"BENGKULU", "BENGKULU", "BENGKULU", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA",
"DAERAH ISTIMEWA YOGYAKARTA", "DAERAH ISTIMEWA YOGYAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA",
"DKI JAKARTA", "GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO",
"GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO",
"GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO",
"GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO",
"GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO", "GORONTALO",
"GORONTALO", "GORONTALO", "GORONTALO", "JAMBI", "JAMBI", "JAMBI",
"JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI",
"JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI",
"JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI", "JAMBI",
"JAMBI", "JAMBI", "JAMBI", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT",
"JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT",
"JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT",
"JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT",
"JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT",
"JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA BARAT", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH", "JAWA TENGAH",
"JAWA TENGAH", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR",
"JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR",
"JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR",
"JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR",
"JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR",
"JAWA TIMUR", "JAWA TIMUR", "JAWA TIMUR"), Date = structure(c(1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400, 1633392000, 1633478400, 1633564800, 1633651200,
1633737600, 1633824000, 1633910400, 1633996800, 1634083200, 1634169600,
1634256000, 1634342400, 1634428800, 1634515200, 1634601600, 1634688000,
1634774400, 1634860800, 1634947200, 1635033600, 1635120000, 1635206400,
1635292800, 1635379200, 1635465600, 1635552000, 1635638400, 1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400, 1633392000, 1633478400, 1633564800, 1633651200,
1633737600, 1633824000, 1633910400, 1633996800, 1634083200, 1634169600,
1634256000, 1634342400, 1634428800, 1634515200, 1634601600, 1634688000,
1634774400, 1634860800, 1634947200, 1635033600, 1635120000, 1635206400,
1635292800, 1635379200, 1635465600, 1635552000, 1635638400, 1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400, 1633392000, 1633478400, 1633564800, 1633651200,
1633737600, 1633824000, 1633910400, 1633996800, 1634083200, 1634169600,
1634256000, 1634342400, 1634428800, 1634515200, 1634601600, 1634688000,
1634774400, 1634860800, 1634947200, 1635033600, 1635120000, 1635206400,
1635292800, 1635379200, 1635465600, 1635552000, 1635638400, 1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400, 1633392000, 1633478400, 1633564800, 1633651200,
1633737600, 1633824000, 1633910400, 1633996800, 1634083200, 1634169600,
1634256000, 1634342400, 1634428800, 1634515200, 1634601600, 1634688000,
1634774400, 1634860800, 1634947200, 1635033600, 1635120000, 1635206400,
1635292800, 1635379200, 1635465600, 1635552000, 1635638400, 1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400, 1633392000, 1633478400, 1633564800, 1633651200,
1633737600, 1633824000, 1633910400, 1633996800, 1634083200, 1634169600,
1634256000, 1634342400, 1634428800, 1634515200, 1634601600, 1634688000,
1634774400, 1634860800, 1634947200, 1635033600, 1635120000, 1635206400,
1635292800, 1635379200, 1635465600, 1635552000, 1635638400, 1633392000,
1633478400, 1633564800, 1633651200, 1633737600, 1633824000, 1633910400,
1633996800, 1634083200, 1634169600, 1634256000, 1634342400, 1634428800,
1634515200, 1634601600, 1634688000, 1634774400, 1634860800, 1634947200,
1635033600, 1635120000, 1635206400, 1635292800, 1635379200, 1635465600,
1635552000, 1635638400), tzone = "UTC", class = c("POSIXct",
"POSIXt")), Proporsi = c(0.148500365764448, 0.155075939248601,
0.166821994408201, 0.200239808153477, 0.210322580645161, 0.221628838451268,
0.245283018867925, 0.250853242320819, 0.222826086956522, 0.30859375,
0.338297872340426, 0.316901408450704, 0.372037914691943, 0.367875647668394,
0.451104100946372, 0.546938775510204, 0.533632286995516, 0.516279069767442,
0.428571428571429, 0.615384615384615, 0.617801047120419, 0.556179775280899,
0.585798816568047, 0.389937106918239, 0.705479452054795, 0.830769230769231,
0.888888888888889, 0.483547925608011, 0.448795180722892, 0.464462809917355,
0.477508650519031, 0.462522851919561, 0.453667953667954, 0.533477321814255,
0.591346153846154, 0.55050505050505, 0.551724137931034, 0.611111111111111,
0.612582781456954, 0.675409836065574, 0.726235741444867, 0.82824427480916,
0.778625954198473, 0.839662447257384, 0.990654205607477, 0.917948717948718,
0.936507936507937, 1.14838709677419, 1.26760563380282, 1.1203007518797,
1.33644859813084, 1.52083333333333, 1.6304347826087, 1.46875,
0.154362416107383, 0.141891891891892, 0.132275132275132, 0.14962962962963,
0.153030303030303, 0.163779527559055, 0.165605095541401, 0.138041733547352,
0.135514018691589, 0.135258358662614, 0.133834586466165, 0.138686131386861,
0.13543599257885, 0.138632162661738, 0.143897996357013, 0.172335600907029,
0.173027989821883, 0.161458333333333, 0.145780051150895, 0.211538461538462,
0.226053639846743, 0.285714285714286, 0.225806451612903, 0.221789883268482,
0.194756554307116, 0.189922480620155, 0.195571955719557, 0.230769230769231,
0.231578947368421, 0.345238095238095, 0.24, 0.220779220779221,
0.233766233766234, 0.208333333333333, 0.267605633802817, 0.277777777777778,
0.3125, 0.216216216216216, 0.197183098591549, 0.208333333333333,
0.208333333333333, 0.26865671641791, 0.264705882352941, 0.323529411764706,
0.348484848484849, 0.212121212121212, 0.149253731343284, 0.208955223880597,
0.258064516129032, 0.1875, 0.15625, 0.171875, 0.171875, 0.171875,
0.243654822335025, 0.238565022421525, 0.228383458646617, 0.226874391431353,
0.200205338809035, 0.190782422293676, 0.209919261822376, 0.192401960784314,
0.191157347204161, 0.207520891364903, 0.199391171993912, 0.242375601926164,
0.276632302405498, 0.305400372439479, 0.330693069306931, 0.324435318275154,
0.34341252699784, 0.330357142857143, 0.309352517985612, 0.392592592592593,
0.355329949238579, 0.380829015544041, 0.380952380952381, 0.330601092896175,
0.3359375, 0.282776349614396, 0.360824742268041, 0.69305724725944,
0.681953543776057, 0.64661214953271, 0.68118572292801, 0.422915416916617,
0.426182237600923, 0.489071038251366, 0.564325177584846, 0.534277198211624,
0.489705882352941, 0.432394366197183, 0.474926253687316, 0.479734708916728,
0.522831050228311, 0.51386748844376, 0.526771653543307, 0.529411764705882,
0.516984258492129, 0.523975588491718, 0.532637075718016, 0.573308270676692,
0.678532901833873, 0.690744920993228, 0.637139807897545, 0.61522633744856,
0.590465872156013, 0.574712643678161, 0.0579710144927536, 0.0882352941176471,
0.19047619047619, 0.142857142857143, 0.0476190476190476, 0.0869565217391304,
0.130434782608696, 0.12, 0.185185185185185, 0.166666666666667,
0.142857142857143, 0.032258064516129, 0, 0.0238095238095238,
0.0666666666666667, 0.0869565217391304, 0.0638297872340425, 0.0377358490566038,
0.0196078431372549, 0.037037037037037, 0.0166666666666667, 0,
0, 0.0344827586206897, 0.0338983050847458, 0.0344827586206897,
0.0689655172413793, 0.156976744186047, 0.126801152737752, 0.115044247787611,
0.115727002967359, 0.107361963190184, 0.13, 0.127208480565371,
0.116541353383459, 0.125984251968504, 0.122362869198312, 0.0956521739130435,
0.103139013452915, 0.140625, 0.144736842105263, 0.227642276422764,
0.273684210526316, 0.227272727272727, 0.246753246753247, 0.287671232876712,
0.289473684210526, 0.276923076923077, 0.288461538461538, 0.3,
0.26, 0.297872340425532, 0.297872340425532, 0.272727272727273,
0.268315445636958, 0.271113243761996, 0.256658595641647, 0.225711481844946,
0.227474150664697, 0.229614807403702, 0.239833159541189, 0.249733759318424,
0.241122565864834, 0.235055136390017, 0.244874048037493, 0.211981566820276,
0.207656612529002, 0.211714460036608, 0.228842247799594, 0.232904536222072,
0.256134969325153, 0.276629570747218, 0.274821286735504, 0.268691588785047,
0.298230834035383, 0.281067556296914, 0.261344537815126, 0.25604670558799,
0.243498817966903, 0.211482558139535, 0.208860759493671, 0.145617667356798,
0.137005163511188, 0.138736263736264, 0.131660364386387, 0.133039945836154,
0.125634947511006, 0.124826147426982, 0.120970537261698, 0.153423499577346,
0.144299537231805, 0.138256762559038, 0.144809910294746, 0.154576856649396,
0.169515011547344, 0.166277440448389, 0.16196205460435, 0.169491525423729,
0.156326331216414, 0.162421912542047, 0.160140562248996, 0.165185572399373,
0.162790697674419, 0.148958333333333, 0.151090342679128, 0.142406726221755,
0.131509731720147, 0.122567069963177, 0.641673243883189, 0.624897624897625,
0.669298245614035, 0.674053554939982, 0.665053242981607, 0.650793650793651,
0.745119305856833, 0.747816593886463, 0.743243243243243, 0.795053003533569,
0.815889029003783, 0.818543046357616, 0.847856154910097, 0.956456456456456,
0.898773006134969, 0.861325115562404, 0.856910569105691, 0.911917098445596,
0.887563884156729, 0.869718309859155, 0.958254269449715, 1.01803607214429,
1.07392197125257, 0.979209979209979, 0.974576271186441, 0.943396226415094,
0.927194860813705)), row.names = c(NA, -297L), class = c("tbl_df",
"tbl", "data.frame"))
One option to color the lines based on the last value is to add a column to your dataframe which contains the last value for each group and could be used to set the color. To get the right colors you have to make use of scale_color_manual. I wasn't sure about how you want to place the label. In my opinion it's not absolutely necessary to make use of ggrepel. Instead I made use of geom_text to place the labels on the right of the last data point. To this end I increased the expansion of the x scale to make room for the labels. But feel free to switch to ggrepel.
library(ggplot2)
library(dplyr)
mydata <- mydata %>%
group_by(Provinsi) %>%
mutate(
Proporsi_last = last(Proporsi),
color = case_when(Proporsi_last >= 0.4 ~ "RED", Proporsi_last <= 0.2 ~ "GREEN", TRUE ~ "YELLOW")
)
mydata_ends <- mydata %>%
group_by(Provinsi) %>%
top_n(1, Date)
ggplot(data = mydata, aes(x = Date, y = Proporsi, color = color)) +
xlab("Tanggal") +
ylab("Proporsi Keterisian TT/Kasus Aktif") +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
scale_color_manual(values = c(RED = "red", GREEN = "green", YELLOW = "yellow")) +
geom_line() +
geom_text(aes(label = scales::percent(Proporsi)), data = mydata_ends, size = 3, hjust = -.1, color = "black") +
scale_x_datetime(expand = expansion(mult = c(.05, .4))) +
facet_wrap(~Provinsi, ncol = 4) +
ggtitle("Proporsi KeterisianTT/Kasus Aktif") +
theme(plot.title = element_text(family = "Trebuchet MS", face = "bold", size = 20, hjust = 0, color = "#555555")) +
theme(axis.text.x = element_text(angle = 90))

How to Plot Time-series data on Horizontal bar in R?

R: How can I plot a single horizontal bar showing the different stages of continuous time-series data, from the start date to the present date, with a horizontal scrollbar for navigating time?
This is my data:
var_events time_date event_duration veh_id
LD 17-06-2018 13:25 6.52 B33
WL 17-06-2018 13:25 14.52 B31
TL 17-06-2018 13:26 0.32 B32
TE 17-06-2018 13:26 4.58 B13
UL 17-06-2018 13:26 3.45 B12
WT 17-06-2018 13:26 5.46 B25
UL 17-06-2018 13:26 1.56 B17
TL 17-06-2018 13:26 13.6 B33
SL 17-06-2018 13:26 0.05 B32
Here is an example line chart from my previous code:
require(ggplot2)
require(dplyr)
df = structure(list(Event_stage = c("SE", "MN", "MN", "TE", "TE", "TE", "TE", "TE", "TE", "TE", "TE", "WL", "TE", "TE", "SE", "TE", "TE", "WL", "WT", "MN", "WL", "TE", "WL", "WL", "WT", "WL", "LD", "WT", "WL", "WT", "WT", "TE", "WL", "LD", "WT", "LD", "MN", "TL", "TE", "WL", "TL", "TL", "WT", "TE", "TE", "LD", "WT", "TL", "LD" ), event_date = structure(c(1529573704, 1529573710, 1529573713, 1529573724, 1529573855, 1529573874, 1529573880, 1529573895, 1529573906, 1529573918, 1529573925, 1529573931, 1529573931, 1529573941, 1529573947, 1529573969, 1529574006, 1529574054, 1529574088, 1529574114, 1529574120, 1529574123, 1529574134, 1529574137, 1529574148, 1529574163, 1529574164, 1529574148, 1529574169, 1529574170, 1529574178, 1529574188, 1529574189, 1529574196, 1529574178, 1529574188, 1529574203, 1529574213, 1529574214, 1529574214, 1529574215, 1529574227, 1529574231, 1529574242, 1529574244, 1529574245, 1529574248, 1529574260, 1529574262), class = c("POSIXct", "POSIXt"), tzone = "UTC"), stage_duration = c(3.78, 3.47, 2.78, 3.45, 3.32, 4.93, 4.23, 4.22, 3.85, 3.37, 5.88, 5.92, 3.97, 3.7, NA, 4.08, 3.05, 0.57, 11.18, 12.08, 2.6, 3.3, 0.23, 0.85, 0.27, 0.25, 0.82, 10.42, 0.15, 0.43, 1.4, 0.25, 0.7, 0.52, 1.12, 0.45, 12.87, 12.18, 2.92, 0.57, 14.07, 12.72, 17.12, 4.13, 3.13, 0.25, 0.33, 18.98, 1.05), veh_id = c("B35", "B05", "B04", "B08", "B14", "B13", "B04", "B17", "B41", "B05", "B26", "B08", "B35", "B19a", "B10a", "B01a", "B28", "B14", "B14", "B18", "B05", "B37", "B04", "B41", "B04", "B19a", "B04", "B17", "B35", "B13", "B35", "B02b", "B28", "B13", "B19a", "B41", "B02b", "B04", "B15", "B01a", "B41", "B13", "B28", "B27", "B33", "B19a", "B01a", "B19a", "B35")), .Names = c("Event_stage", "event_date", "stage_duration", "veh_id"), row.names = c(NA, -49L), class = c("tbl_df", "tbl", "data.frame"))
# create ggplot
ggplot(data = df %>% filter(veh_id == "B35"), aes(x = event_date,
y = stage_duration)) +
geom_point(aes(color = Event_stage), size= 3) +
geom_line(alpha = 1/2)+
labs(x = "Event date", y = "Stage duration")
This is a sample bar plot. Everything is the same as in the line chart above, but instead of a line with spikes I want a single horizontal bar that is interactive, with a slider/scrollbar to navigate time:
Something resembling this plot, but with only a single horizontal bar and a scrollbar running from the start time to the present time:
df %>% filter(veh_id == "B35") %>%
ggplot(
aes(
x = event_date,
y = stage_duration)
) +
geom_bar(stat = "identity") +
labs(x = "Event date", y = "Stage duration") +
coord_flip()

Display second Y axis using dygraph

I am trying to have two Y axes with different scales. The second Y axis scale doesn't show, and the data is plotted almost entirely off screen. This is what I have:
dygraph(bmsp1, main = "Black MO SP")%>%
dyAxis("y", label = "Depth (m) ", valueRange = c(0, 1.0))%>%
dyAxis("y2", label = "Temp (c) ", valueRange = c(0, 25.0))
Plot of depth and temp
I also tried this but get the error:
dygraph(bmsp1, main = "Black MO SP")%>%
+ dyAxis("y", label = "Depth (m) ", valueRange = c(0, 1.0))%>%
+ dyAxis("y2", label = "Temp (c) ", valueRange = c(0, 25.0))%>%
+ dyAxis("Temp", axis('y2'))
Error in dyAxis(., "Temp", axis("y2"))
I haven't figured out how to add the data using the dput() (file size too large). Here is snapshot from head()
> head(bmsp1)
Depth Temp (c)
2015-09-30 09:00:00 0.003 21.378
2015-09-30 09:15:00 0.228 17.475
2015-09-30 09:30:00 0.228 17.475
2015-09-30 09:45:00 0.224 17.475
2015-09-30 10:00:00 0.225 17.475
2015-09-30 10:15:00 0.224 17.475
Here is dput() for 75 rows (I think).
> dput(head(bmsp1, 75))
structure(c(0.003, 0.228, 0.228, 0.224, 0.225, 0.224, 0.227,
0.226, 0.23, 0.218, 0.223, 0.224, 0.229, 0.226, 0.226, 0.222,
0.228, 0.233, 0.233, 0.233, 0.232, 0.225, 0.217, 0.209, 0.204,
0.212, 0.222, 0.212, 0.23, 0.224, 0.216, 0.228, 0.231, 0.23,
0.223, 0.223, 0.232, 0.224, 0.223, 0.225, 0.224, 0.219, 0.215,
0.211, 0.211, 0.215, 0.221, 0.213, 0.216, 0.222, 0.222, 0.224,
0.217, 0.212, 0.214, 0.212, 0.209, 0.21, 0.207, 0.207, 0.206,
0.205, 0.204, 0.204, 0.203, 0.198, 0.197, 0.199, 0.194, 0.184,
0.179, 0.189, 0.195, 0.192, 0.19, 21.378, 17.475, 17.475, 17.475,
17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475,
17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475,
17.475, 17.57, 17.57, 17.57, 17.57, 17.57, 17.475, 17.57, 17.475,
17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475, 17.475,
17.475, 17.379, 17.379, 17.379, 17.379, 17.379, 17.379, 17.379,
17.379, 17.284, 17.284, 17.284, 17.284, 17.284, 17.284, 17.189,
17.189, 17.189, 17.189, 17.094, 17.094, 17.094, 17.094, 16.999,
16.999, 16.999, 16.999, 16.903, 16.903, 16.903, 16.903, 16.903,
16.808, 16.808, 16.808, 16.808, 16.713, 16.713), .indexTZ = "UTC", .indexCLASS = c("POSIXct",
"POSIXt"), tclass = c("POSIXct", "POSIXt"), tzone = "UTC", class = c("xts",
"zoo"), index = structure(c(1443603600, 1443604500, 1443605400,
1443606300, 1443607200, 1443608100, 1443609000, 1443609900, 1443610800,
1443611700, 1443612600, 1443613500, 1443614400, 1443615300, 1443616200,
1443617100, 1443618000, 1443618900, 1443619800, 1443620700, 1443621600,
1443622500, 1443623400, 1443624300, 1443625200, 1443626100, 1443627000,
1443627900, 1443628800, 1443629700, 1443630600, 1443631500, 1443632400,
1443633300, 1443634200, 1443635100, 1443636000, 1443636900, 1443637800,
1443638700, 1443639600, 1443640500, 1443641400, 1443642300, 1443643200,
1443644100, 1443645000, 1443645900, 1443646800, 1443647700, 1443648600,
1443649500, 1443650400, 1443651300, 1443652200, 1443653100, 1443654000,
1443654900, 1443655800, 1443656700, 1443657600, 1443658500, 1443659400,
1443660300, 1443661200, 1443662100, 1443663000, 1443663900, 1443664800,
1443665700, 1443666600, 1443667500, 1443668400, 1443669300, 1443670200
), tzone = "UTC", tclass = c("POSIXct", "POSIXt")), .Dim = c(75L,
2L), .Dimnames = list(NULL, c("Depth", "Temp")))
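This will work. In your last line the "=" was missing from the axis argument, and the series has to be assigned to the second axis with dySeries() rather than dyAxis():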
dygraph(bmsp1, main = "Black MO SP")%>%
dyAxis("y", label = "Depth", valueRange = c(0, 1.0), independentTicks = TRUE)%>%
dyAxis("y2", label = "Temp ", valueRange = c(0, 25.0), independentTicks = TRUE) %>%
dySeries("Temp", axis=('y2'))
