Loop in tidyverse

I am learning the tidyverse and working with a time-series dataset. I selected the columns that start with sec. I would like to identify the values in those columns that equal 123, keep them, and replace the rest with 0. But I don't know how to loop over sec1:sec4. Also, how can I sum() per column?
df1<-df %>%
select(starts_with("sec")) %>%
select(ifelse("sec1:sec4"==123, 1, 0))
Sample data:
structure(list(sec1 = c(1, 123, 1), sec2 = c(123, 1, 1), sec3 = c(123,
0, 0), sec4 = c(1, 123, 1)), spec = structure(list(cols = list(
sec1 = structure(list(), class = c("collector_double", "collector"
)), sec2 = structure(list(), class = c("collector_double",
"collector")), sec3 = structure(list(), class = c("collector_double",
"collector")), sec4 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), row.names = c(NA,
-3L), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"))

You can use mutate() with across() for this. The code below applies a function to each column starting with sec, keeping every value equal to 123 and replacing all others with 0.
library(dplyr)
df1 <- df %>%
  select(starts_with("sec")) %>%
  mutate(across(starts_with("sec"), .fns = function(x) ifelse(x == 123, x, 0)))
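The question also asks how to sum per column. A minimal sketch, assuming the four sec columns from the sample data (rebuilt here as a plain tibble) and using summarise() with across(); the names col_sums and n_hits are only illustrative:

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain tibble
df <- tibble(
  sec1 = c(1, 123, 1),
  sec2 = c(123, 1, 1),
  sec3 = c(123, 0, 0),
  sec4 = c(1, 123, 1)
)

# Keep values equal to 123, replace everything else with 0
df1 <- df %>%
  mutate(across(starts_with("sec"), ~ ifelse(.x == 123, .x, 0)))

# One sum per column
col_sums <- df1 %>%
  summarise(across(starts_with("sec"), sum))

# Number of values equal to 123 per column (illustrative name)
n_hits <- df %>%
  summarise(across(starts_with("sec"), ~ sum(.x == 123)))
```

If a named numeric vector is enough, colSums(df1) should give the same totals without summarise().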

Related

Count edges using the adjacency matrix

Based on the adjacency matrix, I would like to count the number of unique edges in a network. In the below example I coloured the unique edges between the different nodes. But I don't know how to proceed.
Desired output:
Sample data
structure(list(...1 = c("m1", "m2", "m3", "m4"), m1 = c(0.2,
0.2, 0.2, 0.3), m2 = c(0.1, 0.2, 0.2, 0.6), m3 = c(0.5, 0.2,
1, 0), m4 = c(0.3, 0, 0, 0.1)), row.names = c(NA, -4L), spec = structure(list(
cols = list(...1 = structure(list(), class = c("collector_character",
"collector")), m1 = structure(list(), class = c("collector_double",
"collector")), m2 = structure(list(), class = c("collector_double",
"collector")), m3 = structure(list(), class = c("collector_double",
"collector")), m4 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))

Assuming that this is an undirected graph such that 0 indicates no edge and a positive number indicates an edge, convert the input DF to a logical matrix and from that to an igraph object. Then get its edges and the names of those edges. (Another possible output is by using as_edgelist(g) to get a 2 column matrix such that each row defines an edge.)
If it were intended that the graph be directed then replace "undirected" with "directed" and in that case a character vector of 13 edge names will be produced instead of the 9 undirected edges shown below.
library(igraph)
m <- as.matrix(DF[-1])
rownames(m) <- colnames(m)
g <- graph_from_adjacency_matrix(m > 0, "undirected")
e <- E(g)
attr(e, "vnames")
## [1] "m1|m1" "m1|m2" "m1|m3" "m1|m4" "m2|m2" "m2|m3" "m2|m4" "m3|m3" "m4|m4"
Alternately as a pipeline
library(igraph)
library(tibble)
DF %>%
column_to_rownames("...1") %>%
as.matrix %>%
sign %>%
graph_from_adjacency_matrix("undirected") %>%
E %>%
attr("vnames")
## [1] "m1|m1" "m1|m2" "m1|m3" "m1|m4" "m2|m2" "m2|m3" "m2|m4" "m3|m3" "m4|m4"
The graph of g looks like this. (If "directed" had been chosen above then the edges would have arrowheads on them.)
set.seed(123)
plot(g)
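If only the number of edges is needed, it can also be computed directly from the matrix. A minimal base R sketch, assuming the same convention as above (0 means no edge, any positive value means an edge, self-loops count); the matrix is typed in by hand from the sample data, and adj, sym, n_undirected and n_directed are illustrative names:

```r
# Adjacency matrix from the sample data (rows and columns m1..m4)
m <- matrix(
  c(0.2, 0.1, 0.5, 0.3,
    0.2, 0.2, 0.2, 0.0,
    0.2, 0.2, 1.0, 0.0,
    0.3, 0.6, 0.0, 0.1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("m", 1:4), paste0("m", 1:4))
)

adj <- m > 0          # TRUE where an edge exists
sym <- adj | t(adj)   # undirected: an edge in either direction counts

# Undirected: count each pair once (upper triangle, diagonal included)
n_undirected <- sum(sym[upper.tri(sym, diag = TRUE)])

# Directed: every non-zero cell is its own edge
n_directed <- sum(adj)
```

For this data the two counts agree with the 9 undirected and 13 directed edges mentioned above.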
Note
DF <-
structure(list(...1 = c("m1", "m2", "m3", "m4"), m1 = c(0.2,
0.2, 0.2, 0.3), m2 = c(0.1, 0.2, 0.2, 0.6), m3 = c(0.5, 0.2,
1, 0), m4 = c(0.3, 0, 0, 0.1)), row.names = c(NA, -4L), spec = structure(list(
cols = list(...1 = structure(list(), class = c("collector_character",
"collector")), m1 = structure(list(), class = c("collector_double",
"collector")), m2 = structure(list(), class = c("collector_double",
"collector")), m3 = structure(list(), class = c("collector_double",
"collector")), m4 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))

Standard deviation based on group id from a data frame

This relates to one of my previous questions. My end goal is to rank items based on the serial variable, using the standard deviation of the start and end times of the day. In short, I would like to calculate both standard deviations (start and end of day) and mark the group with a 1 if both are less than 0.5. What is the best way to do this in R?
Rule that I would like to implement in R:
=IF(AND(STDEV.S(D2,D3,D4)<0.5,STDEV.P(E2, E3, E4)<0.5),1,0)
Sample data structure:
Sample output:
Sample data
df<-structure(list(serial = c(11011209, 11011209, 11011209, 11011209,
11011209, 11011210, 11011210, 11011210, 11011210), pnum = c(1,
1, 1, 2, 2, 2, 2, 2, 2), Day = c("Tue", "Wed", "Thur", "Wed",
"Thur", "Mo", "Tue", "Wed", "Thur"), Start = c(7, 7, 7, 8, 8,
9.75, 6.5, 6.5, 6.5), End = c(14.5, 14.5, 14.5, 15.75, 15.75,
17.75, 14.75, 14.75, 8.75)), class = c("spec_tbl_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -9L), spec = structure(list(
cols = list(serial = structure(list(), class = c("collector_double",
"collector")), pnum = structure(list(), class = c("collector_double",
"collector")), Day = structure(list(), class = c("collector_character",
"collector")), Start = structure(list(), class = c("collector_double",
"collector")), End = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))

After grouping by 'serial' and 'pnum', create 'Pattern' by checking whether the sd of both the 'Start' and 'End' columns is less than 0.5, combining the two conditions into a single expression with &. The unary + converts the logical result to 0/1.
library(dplyr)
df %>%
group_by(serial, pnum) %>%
mutate(Pattern = +(sd(Start) < 0.5 & sd(End) < 0.5)) %>%
ungroup
Or instead of specifying each column separately, use if_all
df %>%
group_by(serial, pnum) %>%
mutate(Pattern = +(if_all(c(Start, End), ~ sd(.) < 0.5))) %>%
ungroup
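Note that the Excel rule in the question mixes STDEV.S and STDEV.P, while R's sd() is the sample standard deviation (it divides by n - 1). A minimal sketch on a toy subset of the data, assuming the population version is really wanted for End; sd_pop is a hypothetical helper, not a dplyr or base R function:

```r
library(dplyr)

# Toy data: two groups of three days each (values borrowed from the sample data)
toy <- tibble(
  id    = c(1, 1, 1, 2, 2, 2),
  Start = c(7, 7, 7, 9.75, 6.5, 6.5),
  End   = c(14.5, 14.5, 14.5, 17.75, 14.75, 8.75)
)

# Hypothetical helper: population standard deviation (divides by n, like STDEV.P)
sd_pop <- function(x) sqrt(mean((x - mean(x))^2))

res <- toy %>%
  group_by(id) %>%
  summarise(
    sd_start = sd(Start),      # sample sd (divides by n - 1, like STDEV.S)
    sd_end   = sd_pop(End),    # population sd
    Pattern  = as.integer(sd_start < 0.5 & sd_end < 0.5)
  )
```

Here the first group has constant Start and End values and gets Pattern 1, while the second group varies too much and gets 0.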

Unexpected behavior of filter inside a function dplyr

I have a function that filters a data.frame based on the unique values of a group column that is passed to the function
la <- function(df, grp){
gr <- df %>% pull({{grp}}) %>% unique()
purrr::map(gr, function(x){
print(x)
filter(df, {{grp}} == x)
})
}
When I use it with this df,
x <- structure(list(mac = c("dc:a6:32:21:59:2b", "dc:a6:32:2d:8c:ca",
"dc:a6:32:2d:b8:62", "dc:a6:32:2d:ca:3f"), datetime = structure(c(1594644546,
1594645457, 1594645375, 1594645080), tzone = "UTC", class = c("POSIXct",
"POSIXt")), Comment = c("FED2", "FED7", "FED1", "FED6")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L))
la(x, mac)
I get the proper prints and the subsets.
However, when I use it with this other df, which should be equivalent, it doesn't work as expected.
df <- structure(list(datetime = structure(c(1594644600, 1594644900,
1594645200, 1594645500, 1594645800, 1594646100), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), movement = c(9940.50454596681, 10779.7747307276,
7148.52826988968, 7687.54314683339, 8797.06954533588, 7524.02474093548
), x = c(606, NA, 240, NA, 504, NA), y = c(386, NA, 274, NA,
56, NA), i_x = c(606, 228, 214, 407.5, 500, 292.947368421053),
i_y = c(386, 286, 258, 49.1666666666667, 56, 234), mac = c("dc:a6:32:21:59:2b",
"dc:a6:32:21:59:2b", "dc:a6:32:21:59:2b", "dc:a6:32:21:59:2b",
"dc:a6:32:21:59:2b", "dc:a6:32:21:59:2b")), spec = structure(list(
cols = list(filename = structure(list(), class = c("collector_character",
"collector")), datetime = structure(list(format = ""), class = c("collector_datetime",
"collector")), movement = structure(list(), class = c("collector_double",
"collector")), x = structure(list(), class = c("collector_double",
"collector")), y = structure(list(), class = c("collector_double",
"collector")), i_x = structure(list(), class = c("collector_double",
"collector")), i_y = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = "\t"), class = "col_spec"), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
I get 0 rows on each type of group (my real example has the same groups as the ones for the x dataframe).
Interestingly, this works as expected.
la(select(head(df), mac, datetime), mac)
[1] "dc:a6:32:21:59:2b"
[[1]]
# A tibble: 6 x 2
mac datetime
<chr> <dttm>
1 dc:a6:32:21:59:2b 2020-07-13 12:50:00
2 dc:a6:32:21:59:2b 2020-07-13 12:55:00
3 dc:a6:32:21:59:2b 2020-07-13 13:00:00
4 dc:a6:32:21:59:2b 2020-07-13 13:05:00
5 dc:a6:32:21:59:2b 2020-07-13 13:10:00
6 dc:a6:32:21:59:2b 2020-07-13 13:15:00
What is going on?

As the comment suggests, the problem is the function(x) inside the map call: because df has a column named x, filter() resolves x to that column (data masking) rather than to the function argument, so the comparison runs against the wrong values. I chose another argument name, and now it works.
la <- function(df, grp){
gr <- df %>% pull({{grp}}) %>% unique()
purrr::map(gr, function(tt){
print(tt)
filter(df, {{grp}} == tt)
})
}
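Renaming the argument works, but it can break again if a data frame happens to contain a column with the new name. A minimal sketch of an alternative, assuming the .env pronoun that dplyr's data masking provides; la_env and toy are hypothetical names used only for illustration:

```r
library(dplyr)
library(purrr)

# Same idea as la(), but .env$x forces x to be looked up in the
# calling environment instead of among the columns of df
la_env <- function(df, grp) {
  gr <- df %>% pull({{ grp }}) %>% unique()
  map(gr, function(x) {
    filter(df, {{ grp }} == .env$x)
  })
}

# Toy data with a column named x, which triggered the original problem
toy <- tibble(
  mac = c("a", "a", "b"),
  x   = c(1, NA, 3)
)

out <- la_env(toy, mac)
```

With the toy data, out is a list of two tibbles: the two rows for "a" and the single row for "b".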

Visualize bubbles on a map, using hc_add_series_map() instead of hcmap()

I am trying to visualize a bubble map using highcharter.
It works perfectly with this code:
library(highcharter)
library(tidyverse)
hcmap("custom/africa") %>%
hc_add_series(data = fake_data, type = "mapbubble", maxSize = '10%', color =
"Red", showInLegend = FALSE) %>%
hc_legend(enabled = FALSE)
My data
> dput(fake_data)
structure(list(country = c("DZ", "CD", "ZA", "TZ"), lat = c(28.033886,
-4.038333, -30.559482, -6.369028), lon = c(1.659626, 21.758664,
22.937506, 34.888822), name = c("Algeria", "Congo, Dem. Rep",
"South Africa", "Tanzania"), z = c(20, 5, 10, 1)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), spec =
structure(list(
cols = list(country = structure(list(), class = c("collector_character",
"collector")), lat = structure(list(), class = c("collector_double",
"collector")), lon = structure(list(), class = c("collector_double",
"collector")), name = structure(list(), class = c("collector_character",
"collector")), z = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
The external geo data for Africa originally comes from this source and is used by hcmap().
But I converted it to RDS to use it locally. Available here.
My problem is that I cannot use my code with external data due to corporate IT security restrictions. I cannot deploy this code with Shiny/RMarkdown on Connect; it is blocked.
So my current solution is to
use the same data in RDS format
africa_map_data <- readRDS("africa_map_data.RDS")
And use the hc_add_series_map() with local data instead of hcmap().
highchart() %>%
hc_add_series_map(
map = africa_map_data,
df = fake_data,
value = "z",
joinBy = c("hc-a2", "country"),
type = "mapbubble",
maxSize = '10%',
color = "Red"
)
But it does not work well; I get a mess.
How can I create a bubble map with hc_add_series_map() (or any other way) without hcmap() and without pulling external data?
Thanks!

ggplot loop deal with special characters

Hi there, I'm trying to plot a defined number of graphs using gridExtra.
This works, but unfortunately it cannot handle special characters in the column names. I tried to work around this by using R-friendly names and adding the actual name as a subtitle.
library(gridExtra)
library(ggplot2)
Dataframe<-read.csv2("File_with_R_friendly_names.csv")
names<-read.csv2("File_with_actual_names.csv")
bar<-colnames(names)
list_of_plots<-lapply(names(Dataframe)[2:10], function(i) {
ggplot(Dataframe, aes_string(x="X1", y=i)) + geom_point()+labs(x=i, y="Intensity", subtitle=bar[i])
})
do.call(grid.arrange, c(list_of_plots, ncol=3))
If I put in bar[2], all graphs get an actual name, but it is the same one for all of them, while if I use bar[i], all graphs get NA.
The names I use to suit R are
Met1, Met2, Met3, Met4, Met5, Met6, Met7, Met8, Met9 and Met10
Examples of names that I need on the plots are:
-(-)-Corey lactone
-(2R)-2,3-Dihydroxypropanoic acid
-(D-(+)-Glyceric acid?)
-1,5-Naphthalenediamine
-12-Aminododecanoic acid
-2,5-di-tert-Butylhydroquinone
-2,6-di-tert-Butylphenol
-2-Amino-N,N-diethylacetamide
-2-Ethyl-2-phenylmalonamide
-2-Naphthalenesulfonic acid
Here is the dput to reproduce the bar (names):
bar<-c("X1", "(-)-Corey lactone", "(2R)-2,3-Dihydroxypropanoic acid (D-(+)-Glyceric acid?)", "1,5-Naphthalenediamine", "12-Aminododecanoic acid", "2,5-di-tert-Butylhydroquinone", "2,6-di-tert-Butylphenol", "2-Amino-N,N-diethylacetamide", "2-Ethyl-2-phenylmalonamide", "2-Naphthalenesulfonic acid")
Here is the dput to reproduce the dataframe:
Dataframe<-structure(list(X1 = c(0, 0, 0.25, 0.25, 0.5, 0.5, 1, 1, 2, 2),
Met1 = c(0, 0, 38096319.85, 45978353.93, 35077691.7, 42146132.41,
62606961.17, 32786049.6, 51054004.82, 48898547.32), Met2 = c(0,
0, 1288905.771, 948466.4001, 645979.6463, 1228663.251, 1137957.136,
940928.9344, 1443680.706, 1755726.385), Met3 = c(0, 0, 575887.464,
693692.0349, 1362477.6, 1515767.293, 2241120.502, 2417932.908,
3866432.112, 3894701.876), Met4 = c(0, 0, 16737068.73, 21915551.3,
12088089.1, 16003037.3, 17720785.29, 11957614.24, 13127281.5,
14192542.13), Met5 = c(0, 0, 4556006.426, 4782909.936, 4484706.271,
8019957.826, 5112289.476, 8537488.48, 6680688.948, 5959748.061
), Met6 = c(0, 0, 16874476.32, 15721984.25, 18093323.61,
18619817.92, 22055835.04, 19754379.11, 29211315.88, 27321333.35
), Met7 = c(0, 0, 6604385.457, 6396794.568, 13823034.64,
15449539.63, 26013299.82, 20262673.28, 35301685.57, 33367520.66
), Met8 = c(0, 0, 6727973.448, 7166827.569, 13238311.46,
13986568.69, 20957194.23, 19186953.76, 34513697.47, 31192991.75
), Met9 = c(0, 0, 2373752.304, 3259738.104, 1998529.732,
2387445.15, 2479309.442, 26924139.6, 4611277.427, 2439602.098
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-10L), .Names = c("X1", "Met1", "Met2", "Met3", "Met4", "Met5",
"Met6", "Met7", "Met8", "Met9"), spec = structure(list(cols = structure(list(
X1 = structure(list(), class = c("collector_double", "collector"
)), Met1 = structure(list(), class = c("collector_double",
"collector")), Met2 = structure(list(), class = c("collector_double",
"collector")), Met3 = structure(list(), class = c("collector_double",
"collector")), Met4 = structure(list(), class = c("collector_double",
"collector")), Met5 = structure(list(), class = c("collector_double",
"collector")), Met6 = structure(list(), class = c("collector_double",
"collector")), Met7 = structure(list(), class = c("collector_double",
"collector")), Met8 = structure(list(), class = c("collector_double",
"collector")), Met9 = structure(list(), class = c("collector_double",
"collector"))), .Names = c("X1", "Met1", "Met2", "Met3",
"Met4", "Met5", "Met6", "Met7", "Met8", "Met9")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

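names(Dataframe)[2:10] is a character vector, so i is a column name such as "Met1". bar has no names, so bar[i] returns NA. Loop over the column positions instead and look up the column name inside the function:
list_of_plots <- lapply(2:10, function(i) {
  col <- names(Dataframe)[i]  # R-friendly column name, e.g. "Met1"
  ggplot(Dataframe, aes_string(x = "X1", y = col)) +
    geom_point() +
    labs(x = col, y = "Intensity", subtitle = bar[i])  # bar[i] is the actual name
})
do.call(grid.arrange, c(list_of_plots, ncol = 3))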
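The NA subtitles can be reproduced, and avoided, without any plotting. A minimal base R sketch, assuming a shortened version of bar and of the column names from the question; cols and label_for are hypothetical names used only for illustration:

```r
# Shortened versions of the column names and of bar from the question
cols <- c("X1", "Met1", "Met2", "Met3")
bar  <- c("X1", "(-)-Corey lactone",
          "(2R)-2,3-Dihydroxypropanoic acid", "1,5-Naphthalenediamine")

# Indexing an unnamed vector by a character string gives NA
by_name <- bar["Met1"]

# Indexing by position works
by_position <- bar[2]

# Alternative: name the labels by column, then look them up by column name
label_for <- setNames(bar, cols)
by_lookup <- label_for[["Met1"]]
```

With a named vector such as label_for, a loop over column names could use subtitle = label_for[[i]] and keep the original lapply(names(Dataframe)[2:10], ...) call.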
