This is my first time asking a question here so I apologize in advance if I haven't given enough information.
I have the following data frame:
Latitude <- c("-108.6125","-108.5114","-108.805","-108.4014","-108.5615",
"-108.8349","-108.225","-108.3139","-108.5568","-108.4968")
Longitude <- c("39.02205","39.22255","39.598","38.89478","39.06429",
"39.27625","39.03","39.1306","39.14823","38.89795")
Depth <- c("60.7735","56.45783","49.65","60.15","50","53.95417",
"50.825","56","55.843","38.73333")
Salinity <- c("35","34","34.5","36","32","33.5","35","34","35","33")
ctd <- data.frame(x = as.numeric(Latitude),
y = as.numeric(Longitude),
z = as.numeric(Depth),
a = as.numeric(Salinity))
I am trying to produce a 3D plot of the variables Latitude, Longitude, Depth, and an environmental parameter (e.g. Salinity/Temperature etc.) from multiple CTD transects. What I am trying to do looks like this:
I have tried using plotly with the plot_ly function, which is fairly close to what I want:
plot_ly(x=ctd$x, y=ctd$y, z=ctd$z) %>% add_markers(color = ctd$a)
However, I can't work out how to add interpolated data to plotly. I've previously used mba.surf when I only wanted to plot Latitude, Depth, and e.g. Salinity, but from what I understand mba.surf only accepts 3 variables.
I hope that makes sense. Thanks in advance!
I would like to automate an analysis I have been doing with Graphpad Prism with R, but apparently it is harder than I thought.
I have Voltage~Time data that I would like to integrate and plot. In Graphpad Prism, this is performed by Analysis -> Integrate -> Create the Integral.
Below I plot the data in Prism, along with the trace that I got from the Plot Integral command.
How can I do that with R?
The data I used are similar to these:
Time <- seq(1,100,1)
Voltage <- sample(1:1000,100, replace = F)
I tried integrate(), but that requires a function to integrate, which I do not have, and gives me just a number.
I tried approxfun() and I could create a function of my data but again, as soon as I apply 'integrate()' I only got a single value.
Do you have any ideas on what the Graphpad Prism function does and how I can translate that to R?
Thank you for the help!
With discrete values you can use cumsum:
set.seed(1)
Time <- seq(1,100,1)
Voltage <- sample(1:1000,100, replace = F)
df = data.frame(Time, Voltage)
library(ggplot2)
p1 <- ggplot(data = df)+
geom_line(aes(x = Time, y = Voltage))
p2 <- ggplot(data = df)+
geom_line(aes(x = Time, y = cumsum(Voltage)))
library(gridExtra)
grid.arrange(p1, p2)
For unevenly spaced time values, multiply each voltage by the width of its time step before accumulating:
cumsum(df$Voltage[1:(nrow(df)-1)] * diff(df$Time))
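If a slightly smoother estimate is wanted, the same running total can be sketched with the trapezoidal rule, which averages neighbouring voltages over each time step. This is only a sketch: `cumtrapz` is a hypothetical helper name, not a base R function, the unevenly spaced `Time` values are simulated, and I don't know whether this matches exactly what Prism computes.

```r
# Hypothetical helper (not part of base R): running trapezoidal integral of y over x.
cumtrapz <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  widths  <- diff(x)                          # width of each time step
  heights <- (head(y, -1) + tail(y, -1)) / 2  # mean of neighbouring y values
  c(0, cumsum(widths * heights))              # integral is 0 at the first point
}

set.seed(1)
Time    <- sort(runif(100, 0, 100))           # assumed: unevenly spaced sample times
Voltage <- sample(1:1000, 100, replace = FALSE)

integral <- cumtrapz(Time, Voltage)

# The result has one value per time point, so it can be plotted as a trace.
plot(Time, integral, type = "l", ylab = "Integral of Voltage")
```

Because the result has the same length as `Time`, it can go straight into the data frame and be plotted with ggplot2 in the same way as the cumsum column.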
I have approx. 4800 lat/lon points with a noise level, that I would like to draw contours on (e.g. 50dB, 55dB, ...). You can have a look at the data here: https://pastebin.com/LkfWYwJe
When I run
ggplot(
data,
aes(x =Lat, Lon, z = Value)
) + stat_contour(binwidth = 10)
I receive a
Warning message:
Not possible to generate contour data
Unfortunately I don't have any idea why this happens. Derived from other questions here, I tried less data, but this did not have an effect.
Any hint/advice/remark is highly appreciated. Thanks!
Edit: The problem seems to be unrelated to data not building a grid. I uploaded new sample data, which forms a perfect grid. https://pastebin.com/F4c7hWcY This dataset shows the very same issue as described above.
Your data file doesn't seem to have a value for each possible combination of Lat and Lon. Instead, every Lat value is present only once in the data.frame. The same holds true for the Lon variable:
data[data$Lat == data$Lat[1],]
results in
# X Lat Lon DayNoise
# 1 98 12.69871 52.49891 31.70291
When you round the data it kind of works:
data$Lat <- round(data$Lat,digits = 3)
data$Lon <- round(data$Lon,digits = 3)
ggplot(data, aes(x=Lat, y=Lon, z=Value)) +
stat_contour(binwidth=10)
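To show why rounding helps, here is a minimal base R sketch on simulated points (the real pastebin data is not used, and the column names are only borrowed from the question). Rounding snaps scattered coordinates onto a coarse grid; averaging the values that land in the same cell is my own assumption about how to handle collisions.

```r
set.seed(42)
n <- 2000

# Simulated scattered measurements: practically every coordinate is unique.
pts <- data.frame(Lat = runif(n, 12.6, 12.8),
                  Lon = runif(n, 52.4, 52.6))
pts$Value <- 50 + 100 * (pts$Lat - 12.6) + 50 * (pts$Lon - 52.4)

# Snap the coordinates onto a coarse grid by rounding.
pts$LatBin <- round(pts$Lat, 2)
pts$LonBin <- round(pts$Lon, 2)

# Average all values that fall into the same cell; empty cells become NA.
grid <- tapply(pts$Value, list(pts$LatBin, pts$LonBin), mean)

# contour() needs increasing x and y vectors plus a matrix of z values.
contour(x = as.numeric(rownames(grid)),
        y = as.numeric(colnames(grid)),
        z = grid, xlab = "Lat", ylab = "Lon")
```

The number of digits controls the trade-off: fewer digits give fuller cells but a coarser contour.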
I'm currently analysing some data I've retrieved from a survey and I want to create a histogram with it.
The problem is that the data is in pairs of range and absolute frequency, something like this, with different ranges:
Since the intervals are not the same, how can I generate the histogram in R?
Thank you in advance.
I think you want a bar chart instead of a histogram. Here's an article that explains the difference nicely.
For a barchart with the data you provided in the format you've indicated you could do something like this:
my_data <- data.frame(range = c('[0-2]','[2-5]','[5-9]'),
abs_frequency = c(2,10,5))
library(ggplot2)
plot <- ggplot(data = my_data, aes(x = range, y = abs_frequency))
plot +
geom_bar(stat="identity")
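If you do want a true histogram with unequal intervals, one common convention is to make the bar area, not the bar height, proportional to the frequency. A rough base R sketch follows, assuming the ranges are contiguous with edges at 0, 2, 5 and 9 (my reading of the example labels, not something stated in the question):

```r
breaks        <- c(0, 2, 5, 9)   # assumed bin edges for '[0-2]', '[2-5]', '[5-9]'
abs_frequency <- c(2, 10, 5)

widths  <- diff(breaks)              # 2, 3, 4
density <- abs_frequency / widths    # frequency per unit of the x variable

# Set up an empty plot, then draw one rectangle per bin.
plot(range(breaks), c(0, max(density)), type = "n",
     xlab = "range", ylab = "frequency per unit width")
rect(xleft = head(breaks, -1), ybottom = 0,
     xright = tail(breaks, -1), ytop = density, col = "grey")
```

With this scaling, a wide bin no longer looks more important simply because it covers more of the axis.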
I have some time series data. I've used a few functions to find specific x,y coordinates and wish to have them highlighted on the facet plot I've created. I'm not even sure if it's possible.
library(reshape2) # for melt()
library(ggplot2)
# create sample data
t <- 1:100
a <- rnorm(100)
b <- rnorm(100)
c <- rnorm(100)
g <- as.data.frame(cbind(t,a,b,c))
g <- melt(g,id="t")
# current facet graph
ggplot(g,aes(x=t,y=value,color=variable))+
geom_point()+
facet_wrap(~variable)
This looks along the lines of what I've got right now. But I've also got the additional data below, which is a dataframe of x,y coordinates.
# sample data with x,y coords
x1 <- c(10,11,15)
y1 <- c(5,6,9)
x2 <- c(50,41,35)
y2 <- c(25,27,19)
xy<-rbind(x1,y1,x2,y2)
colnames(xy)<-c("a","b","c")
I'm not sure how to make this happen. I'd like the coordinates to be graphed in their individual plots.
Thanks for the help above. As you suspected, the format of my data was incorrect. I'm still a novice, so collecting and formatting the data is often not straightforward.
#the format of my 'primary' dataframe
t<-seq(1:100)
a<-rnorm(1:100)
b<-rnorm(1:100)
c<-rnorm(1:100)
g <-as.data.frame(cbind(t,a,b,c))
g <- melt(g,id="t")
#and then converting the original dataframe above to something that matches
xy<-data.frame(t = c(10,11,15,50,41,35),variable=c('a','b','c'),value = c(5,6,9,25,27,19))
As soon as I converted the data frame to the correct format, it became much easier to fiddle around with.
ggplot(g,aes(x=t,y=value,color=variable))+
geom_point()+
geom_point(data=xy,size=4) +
facet_wrap(~variable)
The picture below shows the result. Perhaps a better title for the question would be "Plot multiple dataframes on a single facet".
I started learning R for data analysis and, most importantly, for data visualisation.
Since I am still in the switching process, I am trying to reproduce the activities I was doing with Graphpad Prism or Origin Pro in R. In most of the cases everything was smooth, but I could not find a smart solution for plotting multiple y columns in a single graph.
What I usually get from the software I use for data visualisation looks like this:
Each single black trace is a measurement, and I would like to obtain the same plot in R. In Prism or Origin, this will take a single copy-paste in a XY graph.
I exported the matrix of data (one X, which indicates the time, and multiple Y values, which are the traces you see in the image).
I imported my data in R with the following commands:
library(ggplot2) #loaded ggplot2
Data <- read.csv("Directory/File.txt", header=F, sep="") #imported data
DF <- data.frame(Data) #transformed data into data frame
If I plot my data now, I obtain a series of columns, where the first one (called V1) is the X axis and all the others (V2 to V140) are the traces I want to put on the same graph.
To plot the data, I tried different solutions:
ggplot(data=DF, aes(x=DF$V1, y=DF[V2:V140]))+geom_line()+theme_bw() #did not work
plot(DF, xy.coords(x=DF$V1, y=DF$V2:V140)) #gives me an error
plot(DF, xy.coords(x=V1, y=c(V2:V10))) #gives me an error
I tried the matplot, without success, following the EZH guide:
The code I used is the following: matplot(x=DF$V1, type="l", lty = 2:100)
The only solution I found would be to write a separate plot command for each column, but that is a crazy solution. The number of columns varies among my data sets, and manually entering commands for 140 columns is insane.
What would you suggest?
Thank you in advance.
Some data are also attached. Data: single X, multiple Y
I tried using matplot(). I used very simple sample data with no trend at all, so the output from my code will look terrible, but my main focus is on the code. Since you have already tried matplot(), just check against the solution below to see whether you did it right!
set.seed(100)
df = matrix(sample(1:685765,50000,replace = T),ncol = 100)
colnames(df)=c("x",paste0("y", 1:99))
dt=as.data.frame(df)
matplot(dt[["x"]], y = dt[,c(paste0("y",1:99))], type = "l")
If you want to plot in base R with plot() and lines(), you make a plot and then add lines one at a time, which isn't hard to do.
We start by making some sample data. Since the data in the link seemed to all be on the same scale, I will assume your data frame only has y values and the x value is stored separately.
plotData <- as.data.frame(matrix(sort(rnorm(500)),ncol = 5))
xval <- sort(sample(200, 100))
Now we can initialize a plot with the first column.
plot(xval, plotData[[1]], type = "l",
ylim = c(min(plotData), max(plotData)))
type = "l" makes a line plot instead of a scatter plot
ylim = c(min(plotData), max(plotData)) makes sure the y-axis will fit all the data.
Now we can add the rest of the values.
apply(plotData[-1], 2, lines, x = xval)
plotData[-1] removes the column we already plotted,
apply function with 2 as the second parameter means we want to execute a function on every column,
lines defines the function we are applying to the columns. lines adds a new line to the current plot.
x = xval passes an extra parameter (x) to the lines function.
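Since the number of columns varies between files, the steps above could be wrapped in a small function. This is just a sketch: `plot_traces` is a made-up name, and it assumes the x values are kept separate from a data frame that holds only y columns, as in the sample data above.

```r
# Hypothetical helper: draw every column of `ys` as a line against `x`.
plot_traces <- function(x, ys) {
  ys <- as.data.frame(ys)
  stopifnot(length(x) == nrow(ys))
  # Initialise the plot with the first column, with a y-axis wide enough for all columns.
  plot(x, ys[[1]], type = "l",
       ylim = range(unlist(ys), na.rm = TRUE), ylab = "value")
  # Add the remaining columns one at a time (skipped if there is only one column).
  for (j in seq_along(ys)[-1]) {
    lines(x, ys[[j]])
  }
  invisible(ncol(ys))   # number of traces drawn
}

set.seed(1)
plotData <- as.data.frame(matrix(sort(rnorm(500)), ncol = 5))
xval     <- sort(sample(200, 100))

n_drawn <- plot_traces(xval, plotData)
```

The same call works whether the file has 5 traces or 140, because the loop runs over whatever columns are present.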
If you want to plot the data using ggplot2, the data should be transformed to long format:
library(ggplot2)
library(reshape2)
dat <- read.delim('AP.txt', header = F)
# plotting only first 9 traces
# my RStudio will crash if I plot the full data
df <- melt(dat[1:10], id.vars = 'V1')
ggplot(df, aes(x = V1, y = value, color = variable)) + geom_line()
# if you want all traces to be in same colour, you can use
ggplot(df, aes(x = V1, y = value, group = variable)) + geom_line()