I'm new to R and I'm trying to define a function that calls another function from an R package (pgls and sma). I'm not sure how to do it, or even whether it is possible.
I have tried the following:
For pgls
getpgls <- function(P1, P2, dataf){
PGLSt <- pgls(log(P1)~log(P2), data = dataf, lambda = 'ML')
}
When I call the function:
getpgls(sym('Long'), sym('massAvg'), CompData)
I get:
Error in log(P1) : non-numeric argument to mathematical function
Something similar happens with the sma function:
getsma <- function(P1, P2, dataf){
SMAt <- sma(P1~P2,
log = "xy",
data = dataf,
)
}
When I call the function:
getsma(sym('Long'), sym('massAvg'), Data_Animal_de_pd)
I get the following error:
Error in model.frame.default(formula = P1 ~ P2, data = dataf, drop.unused.levels = TRUE) :
object is not a matrix
When I run both pgls and sma with the same arguments, but outside the function, they run just fine,
i.e.
Long.SMA <- sma(Long~massAvg,
log = "xy",
data = Data_Animal_de_pd,
)
and
Long.PGLS = pgls(log(Long)~log(massAvg), data = CompData, lambda = 'ML')
EDIT:
Here I include small versions of CompData and Data_Animal_de_pd (only with 10 animals and the parameters massAvg and Long).
The class of CompData is "comparative.data" and comes from a function comparative.data which connects a phylogenetic tree with another data frame (Data_Animal_de_pd).
> dput(CompData)
structure(list(phy = structure(list(edge = structure(c(11L, 12L,
13L, 14L, 14L, 15L, 15L, 16L, 17L, 17L, 16L, 13L, 12L, 18L, 18L,
11L, 19L, 19L, 12L, 13L, 14L, 1L, 15L, 2L, 16L, 17L, 3L, 4L,
5L, 6L, 18L, 7L, 8L, 19L, 9L, 10L), dim = c(18L, 2L)), edge.length = c(100.597661,
5.254328, 4.311278, 71.0845800943, 34.327960646, 36.7566030561,
5.779375747, 15.0619109945, 15.9153248095, 15.9153245794, 30.9772360366,
75.39586827, 44.21113726, 36.439042146, 36.4390420969, 108.977279909,
72.27059073, 72.270578302), Nnode = 9L, tip.label = c("Tupaia_minor",
"Hystrix_cristata", "Geocapromys_brownii", "Myocastor_coypus",
"Hydrochoerus_hydrochaeris", "Rhinoceros_sondaicus", "Dasypus_hybridus",
"Tolypeutes_matacus", "Caluromysiops_irrupta", "Acrobates_pygmaeus"
), node.label = 11:19), class = "phylo", order = "cladewise"),
data = structure(list(massAvg = c(0.045, 20, 1.5, 7.5, 50.5,
1350, 5.5, 1.5, 0.45, 0.01), Long = c(21.565, 110.4, 55.52,
68.3266666666667, 175.2, 447.4, 47.02, 44.68, 38.58, 12.67
)), row.names = c("Tupaia_minor", "Hystrix_cristata", "Geocapromys_brownii",
"Myocastor_coypus", "Hydrochoerus_hydrochaeris", "Rhinoceros_sondaicus",
"Dasypus_hybridus", "Tolypeutes_matacus", "Caluromysiops_irrupta",
"Acrobates_pygmaeus"), class = "data.frame"), data.name = "datanm2[, c(\"massAvg\", \"Long\", \"Sci_name2\")]",
phy.name = "newphy", dropped = list(tips = character(0),
unmatched.rows = character(0))), class = "comparative.data")
Data_Animal_de_pd is a data frame that contains the information of the animals such as the length of the bones, etc.
> dput(Data_Animal_de_pd)
structure(list(massAvg = c(20, 50.5, 7.5, 1350, 0.45, 0.045,
1.5, 5.5, 1.5, 0.01), Long = c(110.4, 175.2, 68.3266666666667,
447.4, 38.58, 21.565, 55.52, 47.02, 44.68, 12.67), Sci_name = c("Hystrix cristata",
"Hydrochoerus hydrochaeris", "Myocastor coypus", "Rhinoceros sondaicus",
"Caluromysiops irrupta", "Tupaia minor", "Geocapromys brownii",
"Dasypus hybridus", "Tolypeutes matacus", "Acrobates pygmaeus"
), Sci_name2 = c("Hystrix_cristata", "Hydrochoerus_hydrochaeris",
"Myocastor_coypus", "Rhinoceros_sondaicus", "Caluromysiops_irrupta",
"Tupaia_minor", "Geocapromys_brownii", "Dasypus_hybridus", "Tolypeutes_matacus",
"Acrobates_pygmaeus")), row.names = c("10137", "10149", "10157",
"102233", "126286", "143289", "1543402", "1756220", "183749",
"190720"), class = "data.frame")
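Inside the function, the formula P1 ~ P2 is taken literally, so the model code looks up objects named P1 and P2 and finds the symbols themselves rather than the columns they refer to. To make your function work with symbols (I assume from rlang::sym), you must inject them with rlang::inject: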
getsma <- function(P1, P2, dataf){
SMAt <- rlang::inject(sma(!!P1 ~ !!P2,
log = "xy",
data = dataf,
))
}
Alternatively, you can capture the arguments unevaluated with rlang::enexpr and then inject them:
getsma <- function(P1, P2, dataf){
P1 <- rlang::enexpr(P1)
P2 <- rlang::enexpr(P2)
SMAt <- rlang::inject(sma(!!P1 ~ !!P2,
log = "xy",
data = dataf,
))
}
Then you can pass the bare column names directly:
getsma(Long, massAvg, Data_Animal_de_pd)
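If you would rather avoid rlang, the same idea can be sketched in base R by passing the column names as strings and building the formula from them. This is only a minimal sketch under assumptions: lm() stands in for sma()/pgls() so that it runs without smatr or caper, and fit_loglog and toy are hypothetical names. I have not checked that pgls() accepts a formula built this way.

```r
# Hypothetical helper: fit a log-log model from column names given as strings.
# lm() is only a stand-in for sma()/pgls(), which also take a formula and data.
fit_loglog <- function(y, x, dataf) {
  # Build the formula log(y) ~ log(x) as text, then convert it to a formula
  f <- as.formula(paste0("log(", y, ") ~ log(", x, ")"))
  lm(f, data = dataf)
}

# Toy data: the first five massAvg / Long values from Data_Animal_de_pd
toy <- data.frame(
  massAvg = c(20, 50.5, 7.5, 1350, 0.45),
  Long    = c(110.4, 175.2, 68.3266666666667, 447.4, 38.58)
)

# The wrapped call and the direct call should give the same coefficients
fit_fun    <- fit_loglog("Long", "massAvg", toy)
fit_direct <- lm(log(Long) ~ log(massAvg), data = toy)
```

Note that the function returns the fitted model visibly, because the model call is its last expression rather than an assignment.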
I would like to plot an interactive heatmap, where the column widths are different.
Although I managed to get different cell widths, the widths do not correspond to the values and the ordering is not correct.
The order of the x-axis should remain the same as the segments column in the df data.frame.
If the heatmap doesn't work, I would also be fine with a stacked barchart.
df <- structure(list(
segments = c(101493L, 101493L, 101493L, 101492L, 101492L, 101492L, 101494L, 101494L, 101494L, 102018L, 102018L,
102018L, 102018L, 102018L, 102019L, 102019L, 102019L, 102019L, 102019L),
timestamp = structure(c(1579233600, 1579240800, 1579248000,
1579233600, 1579240800, 1579248000, 1579233600, 1579240800, 1579248000,
1579219200, 1579226400, 1579233600, 1579240800, 1579248000, 1579219200,
1579226400, 1579233600, 1579240800, 1579248000), class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin"),
value = c(91.772, 91.923, 96.968, 104.307, 101.435, 105.539, 104.879, 104.197, 103.038,
96.403, 90.926, 111.807, 115.931, 111.729, 100.129, 86.903, 108.22, 117.841, 112.293),
width = c(5L, 5L, 5L, 2L, 2L, 2L, 3L, 3L, 3L, 10L, 10L, 10L, 10L, 10L, 9L, 9L, 9L, 9L, 9L)),
row.names = c(1L, 2L, 3L, 11L, 12L, 13L, 21L, 22L, 23L, 31L, 32L, 33L, 34L, 35L,43L, 44L, 45L, 46L, 47L),
class = "data.frame")
library(plotly)
plot_ly(data = df) %>%
add_trace(type="heatmap",
x = ~as.character(width),
y = ~timestamp,
z = ~value,
xgap = 0.2, ygap = 0.2) %>%
plotly::layout(xaxis = list(rangemode = "nonnegative",
tickmode = "array",
tickvals=as.character(unique(df$width)),
ticktext=as.character(unique(df$segments)),
zeroline = FALSE))
By giving Plotly a matrix for the z-values it seems to work and the widths are respected.
df$newx <- rep(cumsum(df[!duplicated(df$segments),]$width), rle(df$segments)$lengths)
mappdf <- expand.grid(timestamp=unique(df$timestamp), newx=unique(df$newx))
mappdf <- merge(mappdf, df[,c("timestamp","value","newx")], all.x = TRUE, all.y = FALSE, sort = FALSE)
mappdf <- mappdf[order(mappdf$newx, mappdf$timestamp),]
zvals <- matrix(data = mappdf$value,
nrow = length(unique(df$timestamp)),
ncol = length(unique(df$newx)))
plot_ly() %>%
add_heatmap(y = sort(unique(df$timestamp)),
x = c(0,unique(df$newx)),
z = zvals) %>%
plotly::layout(xaxis = list(
title = "",
tickvals=unique(df$newx),
ticktext=paste(unique(df$segments), "-", unique(df$width))
))
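The key step above is turning the per-segment widths into cumulative right-hand edges and repeating each edge for every row of its segment. Here is a minimal base R sketch of just that step on a made-up toy input (seg, w and the other names are hypothetical, not from the question's data):

```r
# Toy input: three segments with widths 5, 2 and 3, several rows each
seg <- c(101L, 101L, 102L, 102L, 102L, 103L)
w   <- c(5L, 5L, 2L, 2L, 2L, 3L)

# Width of each segment, taken from its first row
seg_width <- w[!duplicated(seg)]

# Right-hand edge of each column: cumulative sum of the widths
edges <- cumsum(seg_width)

# Repeat each edge once per row of its segment
# (this assumes the rows of a segment are contiguous, as in the question)
newx <- rep(edges, rle(seg)$lengths)

# x values for the heatmap: left border 0 plus every right-hand edge
x_breaks <- c(0, edges)
```

With one more x value than columns in z, the x values are treated as cell borders, which is why the column widths follow the width column.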
I have a df as follow:
Variable Value
G1_temp_0 37.9
G1_temp_5 37.95333333
G1_temp_10 37.98333333
G1_temp_15 38.18666667
G1_temp_20 38.30526316
G1_temp_25 38.33529412
G1_mean_Q1 38.03666667
G1_mean_Q2 38.08666667
G1_mean_Q3 38.01
G1_mean_Q4 38.2
G2_temp_0 37.9
G2_temp_5 37.95333333
G2_temp_10 37.98333333
G2_temp_15 38.18666667
G2_temp_20 38.30526316
G2_temp_25 38.33529412
G2_mean_Q1 38.53666667
G2_mean_Q2 38.68666667
G2_mean_Q3 38.61
G2_mean_Q4 38.71
I would like to make a line plot with two lines, one for the values "G1_mean_Q1" to "G1_mean_Q4" and one for "G2_mean_Q1" to "G2_mean_Q4".
In the end it should more or less look like this, the x axis should represent the different variables:
The main problem I have is, how to get a basic line plot with this df.
I've tried something like this,
ggplot(df, aes(x = c(1:4), y = Value) + geom_line()
but I always get errors. It would be great if someone could help me. Thanks
Please post your data with dput(data) next time. It makes it easier to read your data into R.
You need to tell ggplot which are the groups. You can do this with aes(group = Sample). For this purpose, you need to restructure your dataframe a bit and separate the Variable into different columns.
library(tidyverse)
dat <- structure(list(Variable = structure(c(5L, 10L, 6L, 7L, 8L, 9L,
1L, 2L, 3L, 4L, 15L, 20L, 16L, 17L, 18L, 19L, 11L, 12L, 13L,
14L), .Label = c("G1_mean_Q1", "G1_mean_Q2", "G1_mean_Q3", "G1_mean_Q4",
"G1_temp_0", "G1_temp_10", "G1_temp_15", "G1_temp_20", "G1_temp_25",
"G1_temp_5", "G2_mean_Q1", "G2_mean_Q2", "G2_mean_Q3", "G2_mean_Q4",
"G2_temp_0", "G2_temp_10", "G2_temp_15", "G2_temp_20", "G2_temp_25",
"G2_temp_5"), class = "factor"), Value = c(37.9, 37.95333333,
37.98333333, 38.18666667, 38.30526316, 38.33529412, 38.03666667,
38.08666667, 38.01, 38.2, 37.9, 37.95333333, 37.98333333, 38.18666667,
38.30526316, 38.33529412, 38.53666667, 38.68666667, 38.61, 38.71
)), class = "data.frame", row.names = c(NA, -20L))
dat <- dat %>%
filter(str_detect(Variable, "mean")) %>%
separate(Variable, into = c("Sample", "mean", "time"), sep = "_")
g <- ggplot(data=dat, aes(x=time, y=Value, group=Sample)) +
geom_line(aes(colour=Sample))
g
Created on 2020-07-20 by the reprex package (v0.3.0)
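If it is not obvious what the filter() and separate() steps do, here is a rough base R equivalent on a small toy subset of the Variable column. It is only a sketch to show the reshaping, not a replacement for the tidyverse code above; variable, means and parts are hypothetical names.

```r
# Toy subset of the Variable column from the question
variable <- c("G1_mean_Q1", "G1_mean_Q2", "G2_mean_Q1", "G2_temp_0")

# Keep only the "mean" rows, like filter(str_detect(Variable, "mean"))
means <- variable[grepl("mean", variable, fixed = TRUE)]

# Split each name on "_" into three pieces, like separate(..., sep = "_")
parts <- do.call(rbind, strsplit(means, "_", fixed = TRUE))
colnames(parts) <- c("Sample", "mean", "time")
parts <- as.data.frame(parts, stringsAsFactors = FALSE)
```

The Sample column (G1, G2) is what group = Sample uses to draw one line per group, and time (Q1 to Q4) becomes the x axis.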
I have a dataframe with the following data
my2016.regression.dataframe <- structure(list(Economy_Directorate = structure(c(9L, 1L, 18L,
11L, 5L, 7L), .Label = c("20128895", "25392278", "26802176",
"33214069", "34194316", "34863777", "34867843", "36497785", "37280694",
"37411816", "44460126", "45484123", "47463441", "48354697", "57954259",
"60187650", "65135916", "67317188"), class = "factor"), People_Directorate = structure(c(12L,
14L, 17L, 16L, 13L, 15L), .Label = c("20128895", "25392278",
"26802176", "33214069", "34194316", "34863777", "34867843", "36497785",
"37280694", "37411816", "44460126", "45484123", "47463441", "48354697",
"57954259", "60187650", "65135916", "67317188"), class = "factor")), .Names = c("Economy_Directorate",
"People_Directorate"), row.names = c(NA, -6L), class = "data.frame")
I used the following code to plot it. It plots the points, but it does not plot the lm fit.
Could you help me understand why geom_smooth does not draw the lm line?
library(ggplot2)
ggplot(data =my2016.regression.dataframe )+
geom_point(aes(y=Economy_Directorate,x=People_Directorate))+
geom_smooth(method = "lm",aes(y=Economy_Directorate,x=People_Directorate),
fill="orange",colour="red")
Regards,
You need to convert your columns to numeric types. They are currently factors:
my2016.regression.dataframe$Economy_Directorate = as.numeric(as.character(my2016.regression.dataframe$Economy_Directorate))
my2016.regression.dataframe$People_Directorate = as.numeric(as.character(my2016.regression.dataframe$People_Directorate))
ggplot(data = my2016.regression.dataframe) +
geom_point(aes(y=Economy_Directorate,x=People_Directorate))+
geom_smooth(method = "lm",aes(y=Economy_Directorate,x=People_Directorate),
fill="orange",colour="red")
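The as.character() step matters here. Calling as.numeric() directly on a factor returns the internal level codes, not the numbers shown in the labels. A small base R sketch on a toy factor (f, codes and values are hypothetical names; the labels are borrowed from the question's data):

```r
# Toy factor whose labels are numbers stored as text
f <- factor(c("37280694", "20128895", "67317188"))

# Directly converting a factor gives the position of each value
# among the sorted levels, not the value itself
codes <- as.numeric(f)

# Going through character first recovers the actual numbers
values <- as.numeric(as.character(f))
```

Here codes is 2, 1, 3, while values holds the original numbers.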
I'm pretty new to R, so I don't really know what I'm doing. Anyway, I have data in this format in Excel (as a csv file):
dt <- data.frame(species = rep(c("a", "b", "c"), each = 4),
cover = rep(1:3, times = 4),
depth = rep(c(15, 30, 60, 90), times = 3),
stringsAsFactors = FALSE)
I want to plot a graph of cover against depth, with a different coloured line for each species, and a key for which species is which colour. I don't even know where to start.
Sorry if something similar has been asked before. Any help would be much appreciated!
Don't know if this is in a helpful format but here's some of the actual data, I need to read more about dput I think:
structure(list(species = structure(c(1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L,
11L), .Label = c("Agaricia fragilis", "bryozoan", "Dichocoenia stokesi",
"Diploria labyrinthiformis", "Diploria strigosa", "Madracis decactis",
"Manicina", "Montastrea cavernosa", "Orbicella franksi", "Porites asteroides",
"Siderastrea radians"), class = "factor"), cover = c(0.021212121,
0.04047619, 0, 0, 0, 0, 1.266666667, 4.269047619, 3.587878788,
3.25, 0.118181818, 0.152380952, 0, 0.007142857, 3.806060606,
2.983333333, 14.13030303, 15.76190476, 0.415151515, 0.2, 0.26969697,
0.135714286), depth = c(30L, 15L, 30L, 15L, 30L, 15L, 30L, 15L,
30L, 15L, 30L, 15L, 30L, 15L, 30L, 15L, 30L, 15L, 30L, 15L, 30L,
15L)), .Names = c("species", "cover", "depth"), row.names = c(NA,
22L), class = "data.frame")
Here is a solution using the ggplot2 package.
# Load packages
library(ggplot2)
# Create example data frame based on the original example the OP provided
dt <- data.frame(species = rep(c("a", "b", "c"), each = 4),
cover = rep(1:3, times = 4),
depth = rep(c(15, 30, 60, 90), times = 3),
stringsAsFactors = FALSE)
# Plot the data
ggplot(dt, aes(x = depth, y = cover, group = species, colour = species)) +
geom_line()
This should get you going!
df1 <- read.csv("//file_location.csv", header = TRUE)
library(dplyr)
# Average cover per species and depth, so each species has one value per depth
df1 <- df1 %>% select(species, cover, depth) %>% group_by(species, depth) %>%
summarise(cover = mean(cover))
library(ggplot2)
ggplot(df1, aes(x = depth, y = cover, group = species, color = species)) +
geom_line()
When I generate a phenogram using the phytools package, the tips and tip labels of the trees are not displaying. Does anyone have any ideas on how to fix this, or another way of plotting a phenogram with nodes and tips with a y axis plotted at the value of the trait in question?
Here's what I have:
midpointData <-
structure(list(Species = structure(1:6, .Label = c("Icterus_croconotus",
"Icterus_graceannae", "Icterus_icterus", "Icterus_jamacaii",
"Icterus_mesomelas", "Icterus_pectoralis"), class = "factor"),
bio_1nam = c(243L, 193L, 225L, 209L, 189L, 180L), bio_12nam = c(5127.5,
751.5, 1373, 914.5, 4043.5, 2623.5), bio_16nam = c(1470.5,
442, 656.5, 542, 1392.5, 1074), bio_17nam = c(1094.5, 51.5,
135, 189.5, 768.5, 377.5), bio_2nam = c(97.5, 91.5, 83, 82.5,
81, 102), bio_5nam = c(314, 265.5, 311, 274, 282, 281), bio_6nam = c(167.5,
132.5, 175.5, 154.5, 128, 114)), .Names = c("Species", "bio_1nam",
"bio_12nam", "bio_16nam", "bio_17nam", "bio_2nam", "bio_5nam",
"bio_6nam"), class = "data.frame", row.names = c(NA, -6L))
prunedTargetTree <-
structure(list(edge = structure(c(7L, 7L, 8L, 9L, 9L, 8L, 10L,
11L, 11L, 10L, 1L, 8L, 9L, 2L, 3L, 10L, 11L, 4L, 5L, 6L), .Dim = c(10L,
2L)), Nnode = 5L, tip.label = c("Icterus_mesomelas", "Icterus_pectoralis",
"Icterus_graceannae", "Icterus_croconotus", "Icterus_icterus",
"Icterus_jamacaii"), edge.length = c(0.152443952069696, 0.014866140819964,
0.0311847312922788, 0.106393079957453, 0.106393079957453, 0.0727572150872864,
0.0130293222294024, 0.0517912739330428, 0.0517912739330428, 0.0648205961624452
)), .Names = c("edge", "Nnode", "tip.label", "edge.length"), class = "phylo", order = "cladewise")
library(phytools)
reconBio1 <- ace(midpointData$bio_1nam, prunedTargetTree, type = "continuous", method = "ML")
bio1final <- c(reconBio1$ace, midpointData$bio_1nam)
names(bio1final) <- c(7,8,9,10,11,4,3,5,6,1,2)
plot.new()
phenogram(prunedTargetTree, bio1final, ylim = c(min(bio1final), max(bio1final)))
Here's what the tree looks like:
I have solved the problem, but wanted to share the solution in case others run into the same issue. phenogram() looks for names in the argument x (here bio1final) that match prunedTargetTree$tip.label, not the numeric index of the tip. Instead of:
bio1final <- c(reconBio1$ace, midpointData$bio_1nam);
names(bio1final) <- c(7,8,9,10,11,4,3,5,6,1,2)
it should read:
bio1final <- c(reconBio1$ace, midpointData$bio_1nam);
names(bio1final) <- c(7,8,9,10,11,as.character(midpointData$Species))
Note: as.character is important, because otherwise $Species is read in as a factor, c() keeps only its integer codes, and the tips of the tree still won't plot.
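To see why the factor version fails, here is a small base R sketch with a toy two-level factor (sp, bad and good are hypothetical names). Combining a number with a factor in c() keeps only the factor's integer codes, so the species names never reach names():

```r
# Toy factor with two of the species names from the question
sp <- factor(c("Icterus_croconotus", "Icterus_graceannae"))

# A number followed by a factor: the factor is reduced to its integer codes
bad <- as.character(c(7, sp))

# Converting to character first keeps the species names
good <- as.character(c(7, as.character(sp)))
```

bad ends up as "7", "1", "2", which cannot match any tip label, while good contains the actual species names.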