R: Plot Axis Displays Values Larger than the Original Data

I am using the R programming language. I am following a tutorial on data visualization over here: https://plotly.com/r/3d-surface-plots/
I created my own data and made a 3D plot:
library(plotly)
set.seed(123)
#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)
#3d plot
fig <- plot_ly(z = ~as.matrix(d))
fig <- fig %>% add_surface()
#view plot
fig
As seen here, there is a point on this 3D plot where "y = 97". I am not sure how this is possible, seeing how none of the values within the original data frame "d" are anywhere close to 97. I made sure of this by looking at the individual distributions of each variable in the original data frame "d":
#plot individual densities
plot(density(d$a), main = "density plots", col = "red")
lines(density(d$b), col = "blue")
lines(density(d$c), col = "green")
legend( "topleft", c("a", "b", "c"),
text.col=c("red", "blue", "green") )
As seen here, none of the variables (a,b,c) from the original data frame "d" have any values that are close to 97.
Thus, my question: can someone please explain how it is possible that the point (x = 0, y = 97, z = 25.326) appears on this 3D plot?
Thanks

I am not sure if this will resolve the problem - but using the same logic from this previous stackoverflow post: 3D Surface with Plot_ly in r, with x,y,z coordinates
library(plotly)
set.seed(123)
#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)
data = d
plot_ly() %>%
add_trace(data = data, x=data$a, y=data$b, z=data$c, type="mesh3d" )
Now, it appears that all values seen in this visual plot are contained in the original data frame.
However, I am still not sure what the fundamental (and mathematical) difference between these two plots is.
I am curious to see what others have to say.
Thanks

The problem is how you have your matrix built. Basically, the z-values (in your case the c variable) should be given in a matrix in which the rows and columns are like coordinates for a surface, similar to a grid or raster dataset. The values you see now along the x and y-axis are not the values from your a and b variables but the row and column numbers from your matrix (similar to coordinates). You can open the volcano dataset in R and have a look at how these data are organized, which will surely give you a better understanding of what I am trying to explain.
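To make this concrete, here is a minimal check in base R (no plotly needed) using the data from the question. It assumes, as described above, that plotly labels the axes with zero-based column and row indices of the matrix; the names m, hover_x, hover_y and z_at_hover are only illustrative.

```r
set.seed(123)
a <- rnorm(100, 10, 10)
b <- rnorm(100, 5, 5)
c <- rnorm(100, 5, 10)
d <- data.frame(a, b, c)

m <- as.matrix(d)
dim(m)  # 100 rows and 3 columns: y spans the rows, x spans the columns

# Hover label from the question: x = 0, y = 97 (assumed zero-based indices)
hover_x <- 0
hover_y <- 97

# R indexes from 1, so shift both indices by one
z_at_hover <- m[hover_y + 1, hover_x + 1]
z_at_hover
```

If the assumption holds, z_at_hover is the z-value from the hover label (about 25.326), that is, row 98 of column a. The 97 is a row index, not a data value.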

As Robbie mentioned, it has to do with how your data is organised. To change XYZ data to the same format as the volcano dataset, you can use the following from the raster package (note that rasterFromXYZ expects the x and y values to lie on a regular grid):
library(raster)
r <- rasterFromXYZ(d)
# plot raster
plot_ly(z = as.matrix(r), type = "surface")
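For intuition about what the "volcano format" means, the sketch below builds such a matrix in base R from XYZ data that already lies on a regular grid. The toy data frame xyz and the name z_mat are made up for illustration. Scattered points such as the random a and b in the question are not on a regular grid and would first need to be interpolated or binned.

```r
# Hypothetical XYZ data on a regular 4 x 3 grid
xyz <- expand.grid(x = 1:4, y = 1:3)
xyz$z <- xyz$x + 10 * xyz$y

# One row per unique y, one column per unique x (the layout of `volcano`)
z_mat <- with(xyz, tapply(z, list(y, x), mean))

dim(z_mat)
z_mat["2", "3"]  # the z value stored for y = 2, x = 3
```

A matrix like z_mat could then be passed as z to plot_ly(..., type = "surface").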

Related

R function for creating a 3D plot with 4 variables?

This is my first time asking a question here so I apologize in advance if I haven't given enough information.
I have the following data frame:
Latitude <- c("-108.6125","-108.5114","-108.805","-108.4014","-108.5615",
"-108.8349","-108.225","-108.3139","-108.5568","-108.4968")
Longitude <- c("39.02205","39.22255","39.598","38.89478","39.06429",
"39.27625","39.03","39.1306","39.14823","38.89795")
Depth <- c("60.7735","56.45783","49.65","60.15","50","53.95417",
"50.825","56","55.843","38.73333")
Salinity <- c("35","34","34.5","36","32","33.5","35","34","35","33")
ctd <- data.frame(x = as.numeric(Latitude),
y = as.numeric(Longitude),
z = as.numeric(Depth),
a = as.numeric(Salinity))
I am trying to produce a 3D plot of the variables Latitude, Longitude, Depth, and an environmental parameter (e.g. Salinity/Temperature etc.) from multiple CTD transects. What I am trying to do looks like this:
I have tried using plotly with the plot_ly function, which is fairly close to what I want:
plot_ly(x=ctd$x, y=ctd$y, z=ctd$z) %>% add_markers(color = ctd$a)
However, I can't work out how to add interpolated data to plotly. I've previously used mba.surf when only wanting to plot Latitude, Depth, and e.g. Salinity, but from what I understand mba.surf only accepts 3 variables.
I hope that makes sense. Thanks in advance!

3D plot from model in R plotly?

Is it possible to generate a 3D plot from models using plotly? I tried to search over the internet, but many examples are based on the infamous volcano dataset that generates a plot from a matrix of points.
My two models are:
y = 0.49867x - 4.78577
y = 76.13084x + 4.81945
If not possible, how can I transform my data into the matrix format such as that in the volcano dataset? For more details, I have hosted the data file here. I have never used plotly before and I'm unfamiliar with the grammar, but I think I can manage if I can at least format my data into the likes of the volcano dataset.
Thank you.
To plot a surface with plotly, you need to construct a numeric matrix.
Taking Himmelblau's function as a test:
f <- function(x, y) { (x^2+y-11)^2 + (x+y^2-7)^2 }
Create x and y values:
x <- seq(-6, 6, length = 100)
y <- x
Then create z with the outer() function, which returns a matrix.
z <- outer(x, y, f)
We can now create a surface plot:
library(plotly)
# rows of z are read along y and columns along x, hence t()
plot_ly(x = x, y = y, z = ~t(z)) %>% add_surface()
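As a quick sanity check of what outer() produces, the base R sketch below confirms the shape of z and which index belongs to which variable. It repeats the definitions so that it runs on its own. One thing worth keeping in mind is that plotly's surface trace reads the rows of z along the y-axis and the columns along the x-axis, so for a function that is not symmetric in x and y, t(z) is the matrix that lines up with x = x, y = y.

```r
f <- function(x, y) { (x^2 + y - 11)^2 + (x + y^2 - 7)^2 }
x <- seq(-6, 6, length.out = 100)
y <- x
z <- outer(x, y, f)

dim(z)                             # one row per x value, one column per y value
all.equal(z[3, 7], f(x[3], y[7]))  # z[i, j] is f(x[i], y[j])
isSymmetric(z)                     # FALSE: swapping x and y changes the surface
```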

How to Plot Bar Charts for a Categorical Variable Against an Analytical Variable in R

I'm struggling with how to do something with R that comes very easily to me in Excel: so I'm sure this is something quite basic but I'm just not aware of the equivalent method in R.
In essence, I have two variables in my dataset: a categorical variable which has a list of names, and an analytical variable that has the frequency corresponding to that particular observation.
Something like this:
Name Freq
==== =========
X 100
Y 200
and so on.
I would like to plot a bar chart with the names listed on the X-Axis (X, Y and so on) and bars of height corresponding to the relevant value of the Freq. variable for that observation.
This is something very trivial with Excel; I can just select the relevant cells and create a bar chart.
However, in R I just can't seem to figure out how to do this! The bar charts in R seem to be univariate only and don't behave the way I want them to. Trying to plot the two variables results in a scatter plot, which is not what I'm going for.
Is there something very basic I'm missing here, or is R just not capable of performing this task?
Any pointers would be very helpful.
Edited to Add:
I was primarily trying to use base R's plot function to get the job done.
Using plot(dataset1$Name, dataset1$Freq) does not lead to a bar graph but a scatter plot instead.
First the data.
dat <- data.frame(Name = c("X", "Y"), Freq = c(100, 200))
With base R.
barplot(dat$Freq, names.arg = dat$Name)
If you want to display a long list of names.arg, maybe the best way is to customize your horizontal axis with function staxlab from package plotrix. Here are two example plots.
One, with the axis labels rotated 45 degrees.
set.seed(3)
Name <- paste0("Name_", LETTERS[1:10])
dat2 <- data.frame(Name = Name, Freq = sample(100:200, 10))
bp <- barplot(dat2$Freq)
plotrix::staxlab(1, at = bp, labels = dat2$Name, srt = 45)
Another, with the labels spread over 3 lines.
bp <- barplot(dat2$Freq)
plotrix::staxlab(1, at = bp, labels = dat2$Name, nlines = 3)
Add colors with argument col. See help("par").
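A minimal sketch of the col argument, reusing the two-row example data; the colour names are arbitrary choices.

```r
dat <- data.frame(Name = c("X", "Y"), Freq = c(100, 200))

# One colour per bar; a single colour would be recycled across all bars
bp <- barplot(dat$Freq, names.arg = dat$Name, col = c("steelblue", "tomato"))

bp  # bar midpoints, useful for placing custom axis labels
```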
With ggplot2.
library(ggplot2)
ggplot(dat, aes(Name, Freq)) +
geom_bar(stat = "identity")
To add colors you have the aesthetics colour (for the contour of the bars) and fill (for the interior of the bars).

How do I plot species as different colours in a point pattern (ppp) using spatstat in R?

The set up is this: There are 10 trees within a 20 by 20 m quadrat in a forest. For each tree we know the species, the diameter (in cm), and the location within the quadrat using x,y coordinates.
I would like to plot the trees within the quadrat, where the size of the points are to scale, and each species is represented by a different colour circle.
Use this data for an example:
tag <- as.character(c(1,2,3,4,5,6,7,8,9,10))
species <- c("A","A","A","A","B","B","B","C","C","D")
diameter <- c(50,20,55,30,30,45,15,20,35,45)
x <- c(9,4,5,14,8,19,9,12,10,2)
y <- c(6,7,15,16,12,4,19,2,14,9)
df <- data.frame(tag, species, diameter, x, y)
First I create the point pattern
species_map <- ppp(df$x, df$y, c(0,20), c(0,20))
Then I mark the species and diameter
marks(species_map) <- data.frame(m1 = df$species, m2=(df$diameter))
Now I can plot the point pattern and each point is to scale thanks to the marks on the diameter.
The "markscale" bit is set to 0.01 because the diameter measurements are in cm and the quadrat size is defined in meters.
plot(species_map, which.marks=2, markscale=.01)
Now I want to make the circles of different species different colours, but this is where I'm stuck.
If I try to make a plot that includes both of my marks I just get 2 separate plots, with one using different size points to represent diameter (correctly) and one using different characters to represent different species.
plot(species_map, which.marks= c(1,2), markscale=.01)
How can I get this plot to represent different species using different colors of the same character while ALSO plotting the points to scale?
And how can I make it produce 1 single plot?
Thank you in advance.
Jay
Strangely enough I can't think of a really elegant way to do this. My
best bet is to split the data into separate point patterns by species
and loop through the species and plot. Is that enough for you?
library(spatstat)
tag <- as.character(c(1,2,3,4,5,6,7,8,9,10))
species <- c("A","A","A","A","B","B","B","C","C","D")
diameter <- c(50,20,55,30,30,45,15,20,35,45)
x <- c(9,4,5,14,8,19,9,12,10,2)
y <- c(6,7,15,16,12,4,19,2,14,9)
df <- data.frame(tag, species, diameter, x, y)
species_map <- ppp(df$x, df$y, c(0,20), c(0,20))
marks(species_map) <- data.frame(m1 = df$species, m2=(df$diameter))
You need to choose four colours and fix the same range of diameters in
each plot and then do the loop (the argument bg is passed to symbols and
fills the background of the circles with this colour):
diamrange <- range(diameter)
cols <- c("black", "red", "green", "blue")
species_map_split <- split(species_map, reduce = TRUE)
plot(species_map_split[[1]], markrange = diamrange, markscale=.01,
main = "", cols = cols[1], bg = cols[1])
#> Warning: Interpretation of arguments maxsize and markscale has changed (in
#> spatstat version 1.37-0 and later). Size of a circle is now measured by its
#> diameter.
for(i in 2:4){
plot(species_map_split[[i]], markrange = diamrange, markscale=.01,
add = TRUE, col = cols[i], bg = cols[i])
}
Symbol maps for multiple columns of marks are not yet implemented in spatstat. So you'll need to do something like Ege suggests.
species <- c("A","A","A","A","B","B","B","C","C","D")
diameter <- c(50,20,55,30,30,45,15,20,35,45)
x <- c(9,4,5,14,8,19,9,12,10,2)
y <- c(6,7,15,16,12,4,19,2,14,9)
library(spatstat)
Dat <- data.frame(x,y,species, diameter)
X <- as.ppp(Dat,W=square(20))
marks(X)$species <- factor(marks(X)$species)
ccc <- c("red","green","blue","black")[as.numeric(marks(X)$species)]
plot(X,which.marks="diameter",maxsize=1,main="Elegant?")
plot(X,which.marks="diameter",maxsize=1,bg=ccc,add=TRUE)
# thanks to @RolfTurner for this!
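If spatstat is not a hard requirement, a similar picture can be sketched with base graphics. This is a different technique from the answers above, not a spatstat feature: symbols() draws circles whose radii are given in axis units when inches = FALSE, so the diameters in cm are converted to radii in m. The colour assignment in species_cols is an arbitrary illustrative choice.

```r
species  <- c("A","A","A","A","B","B","B","C","C","D")
diameter <- c(50,20,55,30,30,45,15,20,35,45)   # cm
x <- c(9,4,5,14,8,19,9,12,10,2)
y <- c(6,7,15,16,12,4,19,2,14,9)

# Illustrative colour per species (any four colours would do)
species_cols <- c(A = "black", B = "red", C = "green", D = "blue")
fill <- unname(species_cols[species])

# Convert diameter in cm to radius in m so circles match the quadrat units
radius_m <- diameter / 100 / 2

# inches = FALSE takes radii in x-axis units; asp = 1 keeps the circles round
symbols(x, y, circles = radius_m, inches = FALSE, bg = fill, fg = fill,
        xlim = c(0, 20), ylim = c(0, 20), asp = 1,
        xlab = "x (m)", ylab = "y (m)")
legend("topright", legend = names(species_cols), pt.bg = species_cols,
       pch = 21, title = "species")
```

Because everything is drawn by a single symbols() call, the result is one plot with both size and colour encoded.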

Plot multiple traces in R

I started learning R for data analysis and, most importantly, for data visualisation.
Since I am still in the switching process, I am trying to reproduce the activities I was doing with Graphpad Prism or Origin Pro in R. In most of the cases everything was smooth, but I could not find a smart solution for plotting multiple y columns in a single graph.
What I usually get from the software I use for data visualisation looks like this:
Each single black trace is a measurement, and I would like to obtain the same plot in R. In Prism or Origin, this will take a single copy-paste in a XY graph.
I exported the matrix of data (one X, which indicates the time, and multiple Y values, which are the traces you see in the image).
I imported my data in R with the following commands:
library(ggplot2) #loaded ggplot2
Data <- read.csv("Directory/File.txt", header=F, sep="") #imported data
DF <- data.frame(Data) #transformed data into data frame
If I plot my data now, I obtain a series of columns, where the first one (called V1) is the X axis and all the others (V2 to V140) are the traces I want to put on the same graph.
To plot the data, I tried different solutions:
ggplot(data=DF, aes(x=DF$V1, y=DF[V2:V140]))+geom_line()+theme_bw() #did not work
plot(DF, xy.coords(x=DF$V1, y=DF$V2:V140)) #gives me an error
plot(DF, xy.coords(x=V1, y=c(V2:V10))) #gives me an error
I tried the matplot, without success, following the EZH guide:
The code I used is the following: matplot(x=DF$V1, type="l", lty = 2:100)
The only solution I found would be to write a separate plotting command for each column, but that is a crazy solution. The number of columns varies among my data, and manually entering commands for 140 columns is insane.
What would you suggest?
Thank you in advance.
Here are also some data attached. Data: single X, multiple Y
I tried using matplot(). I used some sample data which has no trend at all, so the output from my code will look terrible, but my main focus is on the code. Since you have already tried matplot(), just check against the solution below to see whether you had done it right!
set.seed(100)
df = matrix(sample(1:685765,50000,replace = T),ncol = 100)
colnames(df)=c("x",paste0("y", 1:99))
dt=as.data.frame(df)
matplot(dt[["x"]], y = dt[,c(paste0("y",1:99))], type = "l")
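Applied to the layout described in the question (the first column V1 is time, every remaining column is a trace), the same call can be written without naming the columns. The data frame DF below is simulated as a stand-in for the imported file, so its values are an assumption.

```r
set.seed(1)

# Stand-in for the imported data: V1 = time, V2..V6 = traces
time <- seq(0, 1, length.out = 50)
traces <- sapply(1:5, function(i) sin(2 * pi * time) + rnorm(50, sd = 0.1))
DF <- data.frame(time, traces)
names(DF) <- paste0("V", seq_len(ncol(DF)))

# Every column except the first is drawn as a line against V1
matplot(DF$V1, as.matrix(DF[-1]), type = "l", lty = 1, col = "black",
        xlab = "time", ylab = "signal")
```

Using DF[-1] rather than a fixed column range means the call does not need to change when the number of traces varies between files.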
If you want to plot in base R, you have to make a plot and add lines one at a time; however, that isn't hard to do.
We start by making some sample data. Since the data in the link seemed to all be on the same scale, I will assume your data frame only has y values and the x value is stored separately.
plotData <- as.data.frame(matrix(sort(rnorm(500)),ncol = 5))
xval <- sort(sample(200, 100))
Now we can initialize a plot with the first column.
plot(xval, plotData[[1]], type = "l",
ylim = c(min(plotData), max(plotData)))
type = "l" makes a line plot instead of a scatter plot
ylim = c(min(plotData), max(plotData)) makes sure the y-axis will fit all the data.
Now we can add the rest of the values.
apply(plotData[-1], 2, lines, x = xval)
plotData[-1] removes the column we already plotted,
apply function with 2 as the second parameter means we want to execute a function on every column,
lines defines the function we are applying to the columns. lines adds a new line to the current plot.
x = xval passes an extra parameter (x) to the lines function.
If you want to plot the data using ggplot2, the data should be transformed to long format:
library(ggplot2)
library(reshape2)
dat <- read.delim('AP.txt', header = F)
# plotting only first 9 traces
# my RStudio will crash if I plot the full data
df <- melt(dat[1:10], id.vars = 'V1')
ggplot(df, aes(x = V1, y = value, color = variable)) + geom_line()
# if you want all traces to be in same colour, you can use
ggplot(df, aes(x = V1, y = value, group = variable)) + geom_line()
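To see what "long format" means without extra packages, the sketch below reshapes a small wide data frame in base R. The data frame wide is made up; stack() names its output columns values and ind, which play the roles of value and variable in the melt() result above.

```r
# Hypothetical wide data: V1 = x, V2 and V3 = two traces
wide <- data.frame(V1 = 1:3, V2 = c(10, 11, 12), V3 = c(20, 21, 22))

# Stack the trace columns and repeat x once per trace
long <- data.frame(V1 = rep(wide$V1, times = ncol(wide) - 1),
                   stack(wide[-1]))
long
```

Each row of long is one (x, value, trace) observation, which is the shape ggplot2 needs to map the trace identifier to colour or group.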
