non-linear 2d object transformation by horizontal axis - r

How can such a non-linear transformation be done?
Here is the code to draw it:
my.sin <- function(ve,a,f,p) a*sin(f*ve+p)
s1 <- my.sin(1:100, 15, 0.1, 0.5)
s2 <- my.sin(1:100, 21, 0.2, 1)
s <- s1+s2+10+1:100
par(mfrow=c(1,2),mar=rep(2,4))
plot(s,t="l",main = "input") ; abline(h=seq(10,120,by = 5),col=8)
plot(s*7,t="l",main = "output")
abline(h=cumsum(s)/10*2,col=8)
Don't look at the vector or its values; only the horizontal grid matters.
####UPDATE####
I see that my question is not clear to many people, and I apologize for that.
Here are examples of transformations along the vertical axis only; maybe this makes it clearer what I want.
#### UPDATE 2 ####
Thanks for your answer, this looks like what I need, but I have a few more questions if I may.
To clarify why I need this: I want to compare vectors with each other that are non-linearly distorted along the horizontal axis. Maybe there are already ready-made tools for this?
You mentioned that there are many ways to do such non-linear transformations; can you name a few of the best ones for my case?
How can I make the function f() more non-linear, so that it consists, for example, not of one sinusoid but of 10 or more? The figure shows that the distortion is quite simple; it corresponds to one sinusoid.
And how can the function f be changed to use different combinations of sinusoids?
set.seed(126)
par(mar = rep(2, 4),mfrow=c(1,3))
s <- cumsum(rnorm(100))
r <- range(s)
gridlines <- seq(r[1]*2, r[2]*2, by = 0.2)
plot(s, t = "l", main = "input")
abline(h = gridlines, col = 8)
f <- function(x) 2 * sin(x)/2 + x
plot(s, t = "l", main = "input+new grid")
abline(h = f(gridlines), col = 8)
plot(f(s), t = "l", main = "output")
abline(h = f(gridlines), col = 8)

If I understand you correctly, you wish to map the vector s from the regular spacing defined in the first image to the irregular spacing implied by the second plot.
Unfortunately, your mapping is not well-defined, since there is no clear correspondence between the horizontal lines in the first image and the second image. There are in fact an infinite number of ways to map the first space to the second.
We can alter your example a bit to make it a bit more rigorous.
If we start with your function and your data:
my.sin <- function(ve, a, f, p) a * sin(f * ve + p)
s1 <- my.sin(1:100, 15, 0.1, 0.5)
s2 <- my.sin(1:100, 21, 0.2, 1)
s <- s1 + s2 + 10 + 1:100
Let us also create a vector of gridlines that we will draw on the first plot:
gridlines <- seq(10, 120, by = 2.5)
Now we can recreate your first plot:
par(mar = rep(2, 4))
plot(s, t = "l", main = "input")
abline(h = gridlines, col = 8)
Now, suppose we have a function that maps our y axis values to a different value:
f <- function(x) 2 * sin(x/5) + x
If we apply this to our gridlines, we have something similar to your second image:
plot(s, t = "l", main = "input")
abline(h = f(gridlines), col = 8)
Now, what we want to do here is effectively transform our curve so that it is stretched or compressed in such a way that it crosses the transformed gridlines at the same points where the original curve crossed the original gridlines. To do this, we simply apply our mapping function to s. We can check the correspondence to the original gridlines by plotting our new curve with a transformed axis:
plot(f(s), t = "l", main = "output", yaxt = "n")
axis(2, at = f(20 * 1:6), labels = 20 * 1:6)
abline(h = f(gridlines), col = 8)
It may be possible to create a mapping function using the cumsum(s)/10 * 2 that you have in your original example, but it is not clear how you want this to correspond to the original y axis values.
Response to edits
It's not clear what you mean by comparing two vectors. If one is a non-linear deformation of the other, then presumably you want to find the underlying function that produces the deformation. It is possible to create a function that applies the deformation empirically simply by doing f <- approxfun(untransformed_vector, transformed_vector).
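As a minimal sketch of that empirical approach: the vectors below are made-up stand-ins (the "transformed" values are generated with the example mapping from this answer, not taken from real data), and `f_hat` and `f_inv` are illustrative names.

```r
# Hypothetical data: known y values and their deformed counterparts.
untransformed_vector <- seq(0, 100, by = 5)
transformed_vector   <- 2 * sin(untransformed_vector / 5) + untransformed_vector

# Piecewise-linear estimate of the deformation, built from the pairs alone
f_hat <- approxfun(untransformed_vector, transformed_vector)

# Because the mapping is strictly increasing, swapping the arguments
# gives an estimate of the inverse mapping
f_inv <- approxfun(transformed_vector, untransformed_vector)

f_hat(c(10, 12.5))   # deformed positions of two y values
f_inv(f_hat(12.5))   # maps back to (approximately) 12.5
f_hat(200)           # NA: outside the range of the observed data
```

Note that `approxfun()` returns `NA` outside the range it was built from by default, so the untransformed vector should span every value you intend to map.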
I didn't say there were many ways of doing non-linear transformations. What I meant is that in your original example, there is no correspondence between the gridlines in the first picture and the second picture, so there is an infinite choice for which gridlines in the first picture correspond to which gridlines in the second picture. There is therefore an infinite choice of mapping functions that could be specified.
The function f can be as complicated as you like, but in this scenario it should at least be strictly increasing everywhere, so that any value of the function's output can be mapped back to a single value of its input. For example, function(x) x + sin(x)/4 + cos(3*(x + 2))/5 would be a complex but ever-increasing sinusoidal function (its slope never drops below 1 - 1/4 - 3/5 = 0.15).
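One possible way to build such a function from many sinusoids is sketched below. `make_warp()` is a hypothetical helper, not something from the original example: it draws random frequencies, phases and amplitudes, then rescales the amplitudes so that the sum of amplitude times frequency stays below 1, which keeps the slope of x + sum(a * sin(w * x + ph)) positive.

```r
# Hypothetical helper: returns a strictly increasing mapping made of the
# identity plus n random sinusoids.
make_warp <- function(n = 10, slack = 0.9, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  w  <- runif(n, 0.1, 2)        # angular frequencies
  ph <- runif(n, 0, 2 * pi)     # phases
  a  <- runif(n)                # raw amplitudes
  a  <- slack * a / sum(a * w)  # now sum(a * w) == slack < 1, so slope >= 1 - slack
  # outer(w, x) is an n-by-length(x) matrix; a and ph recycle down its columns
  function(x) x + colSums(a * sin(outer(w, x) + ph))
}

f10 <- make_warp(n = 10, seed = 126)  # one combination of 10 sinusoids
f15 <- make_warp(n = 15, seed = 7)    # a different combination

gridlines <- seq(10, 120, by = 2.5)
s <- 10 + 1:100 + 15 * sin(0.1 * (1:100) + 0.5)

plot(f10(s), t = "l", main = "output, 10 sinusoids")
abline(h = f10(gridlines), col = 8)
```

Changing `seed` or `n` gives a different distortion, and `slack` (assumed to lie between 0 and 1) controls how strong the distortion can get before the mapping stops being invertible.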

Related

R: PCA plot with different colors for Sites

I've recently been trying to analyse my data and want to make the graphs a little nicer, but I'm failing at this.
So I have a data set with 144 sites and 5 environmental variables. It's basically about the substrate composition around an island and the fish abundance. On this island there is supposed to be a difference in the substrate composition between the north and the south side. Right now I am doing a PCA, and with the biplot function it works quite fine, but I would like to change the plot a bit.
I need one where the sites are just points and not numbered, arrows point to the different variables, and the sites are colored according to their location (north or south side). So I tried everything I could find.
Most examples were with the dune data and suggested something like this:
library(vegan)
library(biplot)
data(dune)
mod <- rda(dune, scale = TRUE)
biplot(mod, scaling = 3, type = c("text", "points"))
So according to this I would just need to say text and points, and R would label the variables and just make points for the sites. When I do this, however, I get the error:
Error in plot.default(x, type = "n", xlim = xlim, ylim = ylim, col = col[1L], :
formal argument "type" matched by multiple actual arguments
No idea how to get around this.
So next strategy I found, is to make a plot manually like this:
require("vegan")
data(dune, dune.env)
mod <- rda(dune, scale = TRUE)
scl <- 3 ## scaling == 3
colvec <- c("red2", "green4", "mediumblue")
plot(mod, type = "n", scaling = scl)
with(dune.env, points(mod, display = "sites", col = colvec[Use],
scaling = scl, pch = 21, bg = colvec[Use]))
text(mod,display="species", scaling = scl, cex = 0.8, col = "darkcyan")
with(dune.env, legend("bottomright", legend = levels(Use), bty = "n",
col = colvec, pch = 21, pt.bg = colvec))
This works fine so far as well: I get different colors and points, but now the arrows are missing. So I found that this should be easy to correct if I just put display="bp" in the text line. But this doesn't work either. Every time I put "bp", R says:
Error in match.arg(display) :
argument "display" is missing, with no default
So I'm kind of desperate now. I looked through all the answers here and I don't understand why display="bp" and type=c("text","points") are not working for me.
If anyone has an idea I would be super grateful.
https://www.dropbox.com/sh/y8xzq0bs6mus727/AADmasrXxUp6JTTHN5Gr9eufa?dl=0
This is the link to my dropbox folder. It contains my R-script and the csv files. The one named environmentalvariables_Kon1 also contains the data about north and southside.
So yeah... if anyone could help me, that would be awesome. I really don't know what to do anymore.
Best regards,
Nancy
You can add arrows with arrows(). See the code for vegan:::biplot.rda to see how it works in the original function.
With your plot, add
g <- scores(mod, display = "species", scaling = scl) # use the same scaling as in plot()
len <- 1
arrows(0, 0, len * g[, 1], len * g[, 2], length = 0.05, col = "darkcyan")
You might want to adjust the value of len to make the arrows longer.
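For readers without vegan installed, the same pattern (blank plot, colored site points, arrows for the variables, legend) can be sketched in base R. This swaps in `prcomp()` for `rda()` and uses the built-in `iris` data with `Species` as a stand-in for the north/south grouping, so the scores are not scaled the way vegan scales them; `len` is an arbitrary arrow multiplier as above.

```r
# Base-R stand-in, assuming prcomp() is an acceptable substitute for rda()
pca    <- prcomp(iris[, 1:4], scale. = TRUE)
grp    <- iris$Species                      # stand-in for the site grouping
colvec <- c("red2", "green4", "mediumblue")

sites <- pca$x[, 1:2]          # site scores on the first two axes
vars  <- pca$rotation[, 1:2]   # variable loadings on the first two axes
len   <- 3                     # arrow length multiplier, adjust to taste

plot(sites, type = "n", asp = 1)
points(sites, pch = 21, col = colvec[grp], bg = colvec[grp])
arrows(0, 0, len * vars[, 1], len * vars[, 2], length = 0.05, col = "darkcyan")
text(1.1 * len * vars[, 1], 1.1 * len * vars[, 2],
     labels = rownames(vars), cex = 0.8, col = "darkcyan")
legend("bottomright", legend = levels(grp), bty = "n",
       col = colvec, pch = 21, pt.bg = colvec)
```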

Dot plots (as opposed to dotplots) in R

I am trying to write a function that will produce what I regard as a real dot plot (unlike the Cleveland variety, I require a univariate scatterplot with the dots stacked for (nearly) equal values). I have come close:
In this illustration, the dots you see are actually rotated text strings of lower-case "o"s. It is done this way because I need the dot spacing to stay constant if the plot is re-scaled. However, I'd like something better than lower-case "o"s, for example, filled dots instead of circles. This could be done if I could access the font that is used for the standard plotting symbols (pch = 1:25 in the plot function and relatives). Then I could make a text string with that font and get what's needed. Does anybody know how to do that?
PS - No, a histogram with lots of bins is not an acceptable substitute.
I did find a way to get the desired dot plot using low-level graphics parameters (namely "usr", the actual user coordinates of the plotting area, and "cxy", the character size). The recordGraphics() function wraps the part that needs to be changed when the graph is resized. Here's the function:
dot.plot = function(x, pch = 16, bins = 50, spacing = 1, xlab, ...) {
  if (missing(xlab))
    xlab = as.character(substitute(x))
  # determine dot positions
  inc = diff(pretty(x, n = bins)[1:2])
  freq = table(inc * round(x / inc, 0))
  xx = rep(as.numeric(names(freq)), freq)
  yy = unlist(lapply(freq, seq_len))
  # make the order of the dots the same as the order of the data
  idx = seq_along(x)
  idx[order(x)] = idx
  xx = xx[idx]
  yy = yy[idx]
  # make a blank plot
  plot(xx, yy, type = "n", axes = FALSE, xlab = xlab, ylab = "")
  # draw scale
  axis(1)
  ylow = par("usr")[3]
  abline(h = ylow) # extend to full width
  # draw points and support resizing
  recordGraphics({
      yinc = 0.5 * spacing * par("cxy")[2]
      points(xx, ylow + yinc * (yy - .5), pch = pch, ...)
    },
    list(),
    environment(NULL))
  invisible()
}
The spacing argument may be used if you want a tighter or looser gap between dots. An example...
with(iris, dot.plot(Sepal.Length, col = as.numeric(Species)))
This is a better solution than trying to do it with text, but also a little bit scary because of the warnings you see in the documentation for recordGraphics.
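To see what the "determine dot positions" and reordering steps of the function produce, here is a small worked sketch on a toy vector. The fixed bin width `inc <- 1` is an assumption made for readability; the function itself derives it from `pretty()`.

```r
x   <- c(3, 1, 2, 3, 3, 1)   # toy data
inc <- 1                     # assumed bin width (dot.plot() computes this)

# Snap each value to its bin and count how many fall in each bin
freq <- table(inc * round(x / inc))

# One (xx, yy) pair per dot: bin position and height within the stack
xx <- rep(as.numeric(names(freq)), freq)
yy <- unlist(lapply(freq, seq_len), use.names = FALSE)

# At this point the dots are in sorted order; idx is the rank of each
# original observation, so indexing with it restores the data order
idx <- seq_along(x)
idx[order(x)] <- idx

data.frame(x = x, bin = xx[idx], stack_height = yy[idx])
```

Keeping the dots in data order is what lets per-observation arguments such as `col = as.numeric(Species)` line up with the right dots.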

Easiest way to plot inequalities with hatched fill?

Refer to the above plot. I have drawn the equations in Excel and then shaded by hand. You can see it is not very neat. You can see there are six zones, each bounded by two or more equations. What is the easiest way to draw inequalities and shade the regions using hatched patterns?
To build on @agstudy's answer, here's a quick-and-dirty way to represent inequalities in R:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
with(as.list(c(b,a)),{
  id <- yB<=yA
  # b<=a area
  polygon(x = c(xB[id], rev(xA[id])),
          y = c(yB[id], rev(yA[id])),
          density=10, angle=0, border=NULL)
  # b>a area
  polygon(x = c(xB[!id], rev(xA[!id])),
          y = c(yB[!id], rev(yA[!id])),
          density=10, angle=90, border=NULL)
})
If the area in question is surrounded by more than 2 equations, just add more conditions:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
d <- curve(0.5*x^2+0.2, add = TRUE) # Third curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
names(d) <- c('xD','yD')
with(as.list(c(a,b,d)),{
  # Basically you have three conditions:
  # curve a is below curve b, curve b is below curve d and curve d is above curve a
  # assign to each curve coordinates the two conditions that concerns it.
  idA <- yA<=yD & yA<=yB
  idB <- yB>=yA & yB<=yD
  idD <- yD<=yB & yD>=yA
  polygon(x = c(xB[idB], xD[idD], rev(xA[idA])),
          y = c(yB[idB], yD[idD], rev(yA[idA])),
          density=10, angle=0, border=NULL)
})
In R, there is only limited support for fill patterns, and they can only be
applied to rectangles and polygons. This is available only within traditional (base) graphics, not ggplot2 or lattice.
It is possible to fill a rectangle or polygon with a set of lines drawn
at a certain angle, with a specific separation between the lines. A density
argument controls the separation between the lines (in terms of lines per inch)
and an angle argument controls the angle of the lines.
Here is an example from the help:
plot(c(1, 9), 1:2, type = "n")
polygon(1:9, c(2,1,2,1,NA,2,1,2,1),
density = c(10, 20), angle = c(-45, 45))
EDIT
Another option is to use alpha blending to differentiate between regions. Here, using @plannapus's example and the gridBase package to superimpose polygons, you can do something like this:
library(grid) # provides pushViewport(), grid.polygon() and gpar()
library(gridBase)
vps <- baseViewports()
pushViewport(vps$figure,vps$plot)
with(as.list(c(a,b,d)),{
  grid.polygon(x = xA, y = yA,gp =gpar(fill='red',lty=1,alpha=0.2))
  grid.polygon(x = xB, y = yB,gp =gpar(fill='green',lty=2,alpha=0.2))
  grid.polygon(x = xD, y = yD,gp =gpar(fill='blue',lty=3,alpha=0.2))
})
upViewport(2)
There are several submissions on the MATLAB Central File Exchange that will produce hatched plots in various ways for you.
I think a tool that will come in handy for you here is gnuplot.
Take a look at the following demos:
fillbetween
statistics
some tricks

Indexing Issues

I've been trying to plot the difference between two sets of information (the residuals). Both sets of data have similar (yet different) characteristics, and both data sets go from 0 to the same X value. The only inconsistency is that they are indexed differently, so while the first graph reaches X in A steps, the second reaches X in B steps. Thus, I cannot simply subtract the dependent variable values of one data frame from the other. I am speaking in very general terms, so I've provided a simple example. I want to plot the residuals between two data sets that look like this:
data1 <- data.frame(x1=c(1,2,3,4,5,6), y1=c(10,5,7,3,2,4))
data2 <- data.frame(x2=c(1,3,6), y2=c(1,3,2))
plot(y1 ~ x1, data = data1, type = 'l', lty = 1, col = 'blue', xlim = c(1,6), ylim = c(0,10))
points(data2$y2 ~ data2$x2, type = 'l', lty = 1, col = 'red')
So I guess my question is:
How can I plot the residuals of two functions (like the above) that are indexed differently? Is there a function that will solve for the residuals between the two data sets?
EDIT1: The example was faulty, Spacedman helped me to rectify this.
If a linear interpolation is good enough, you can use approx to interpolate both data sets at a common set of X coordinates. For example:
xout = sort(unique(c(seq(1,6,length.out=100),data1$x1,data2$x2))) # include data coords (untested)
d1 = approx(data1$x1,data1$y1,xout)
d2 = approx(data2$x2,data2$y2,xout)
plot(xout,d1$y-d2$y,type="l")
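A small worked version of the same idea, restricted to the original X coordinates so the numbers can be checked by hand. It uses `approxfun()` (which returns an interpolating function) instead of `approx()`; the names `f1`, `f2` and `res` are illustrative.

```r
data1 <- data.frame(x1 = c(1, 2, 3, 4, 5, 6), y1 = c(10, 5, 7, 3, 2, 4))
data2 <- data.frame(x2 = c(1, 3, 6), y2 = c(1, 3, 2))

# Linear interpolants of each data set
f1 <- approxfun(data1$x1, data1$y1)
f2 <- approxfun(data2$x2, data2$y2)

# Evaluate both on the union of the X coordinates and subtract
xout <- sort(unique(c(data1$x1, data2$x2)))
res  <- data.frame(x = xout, residual = f1(xout) - f2(xout))
res

plot(res$x, res$residual, type = "b", xlab = "x", ylab = "data1 - data2")
abline(h = 0, lty = 2)
```

At x = 2, for instance, data2 has no observation; its interpolated value lies halfway between (1, 1) and (3, 3), i.e. 2, so the residual there is 5 - 2 = 3.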

Correlation Scatter-matrix plot with different point size (in R)

I just came across this nice code that makes this scatter matrix plot:
(source: free.fr)
And I wanted to apply it to Likert-scale variables (integers from 1 to 5) by making the dots' sizes/colors (in the lower triangle) differ according to how many times each combination of values occurs (like the effect jitter might have given me).
Any idea on how to do this with the base plotting mechanism?
Update:
I made the following function, but don't know how to have the scale of the dots always be "good". What do you think?
panel.smooth2 <- function (x, y, col = par("col"), bg = NA, pch = par("pch"),
                           cex = 1, col.smooth = "red", span = 2/3, iter = 3, ...)
{
  require(reshape)
  z <- merge(data.frame(x,y), melt(table(x ,y)), sort = FALSE)$value
  z <- z / (4*max(z))
  symbols(x, y, circles = z, #rep(0.1, length(x)), #sample(1:2, length(x), replace = T) ,
          inches = FALSE, bg = "blue", fg = bg, add = TRUE)
  # points(x, y, pch = pch, col = col, bg = bg, cex = cex)
  ok <- is.finite(x) & is.finite(y)
  if (any(ok))
    lines(stats::lowess(x[ok], y[ok], f = span, iter = iter),
          col = col.smooth, ...)
}
a1 <- sample(1:5, 100, replace = TRUE)
a2 <- sample(1:5, 100, replace = TRUE)
a3 <- sample(1:5, 100, replace = TRUE)
aa <- data.frame(a1,a2,a3)
pairs(aa , lower.panel=panel.smooth2)
You can use 'symbols' (analogous to the methods 'lines', 'abline' et al.)
This method will give you fine-grained control over both symbols size and color in a single line of code.
Using 'symbols' you can set the symbol size, color, and shape. Shape and size are set by passing in a vector for the size of each symbol and binding it to either 'circles', 'squares', 'rectangles', or 'stars', e.g., 'stars' = c(4, 3, 5, 1). Color is set with 'bg' and/or 'fg'.
symbols( x, y, circles = circle_radii, inches=1/3, bg="blue", fg=NULL)
If I understand the second part of your question, you want to be reasonably sure that the function you use to scale the symbols in your plot does so in a meaningful way. The 'symbols' function scales (for instance) the radii of circles based on values in a 'z' variable (or data.frame column, etc.). In the call above, I set the maximum symbol size (radius) to 1/3 inch; every other symbol has a smaller radius, scaled by the ratio of that data point's value to the largest value. Is this a good choice? I don't know; it seems to me that diameter or particularly circumference might be better. In any event, that's a trivial change. In sum, 'symbols' with 'circles' passed in will scale the radii of the symbols in proportion to the 'z' coordinate, which is probably best suited to continuous variables. I would use color ('bg') for discrete variables/factors.
One way to use 'symbols' is to call your plot function and pass in type='n' which creates the plot object but suppresses drawing the symbols so that you can draw them with the 'symbols' function next.
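A minimal sketch of that type='n' pattern for two Likert-type variables, using simulated data like the question's. Counting the combinations with `table()` and passing `sqrt(Freq)` as the radius (so that circle area, rather than radius, tracks the count) are choices made for this illustration, not something prescribed above; `inches = 0.2` is an arbitrary cap on symbol size.

```r
set.seed(42)
a1 <- sample(1:5, 100, replace = TRUE)
a2 <- sample(1:5, 100, replace = TRUE)

# One row per (a1, a2) combination with its number of occurrences
counts <- as.data.frame(table(a1, a2), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$a1 <- as.numeric(counts$a1)
counts$a2 <- as.numeric(counts$a2)

# Set up the axes without drawing points, then add the sized circles
plot(counts$a1, counts$a2, type = "n", xlab = "a1", ylab = "a2",
     xlim = c(0.5, 5.5), ylim = c(0.5, 5.5))
symbols(counts$a1, counts$a2, circles = sqrt(counts$Freq),
        inches = 0.2, bg = "blue", fg = "blue", add = TRUE)
```

Because `inches` fixes the size of the largest symbol, the dots stay readable regardless of how large the counts get.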
I would not recommend 'cex' for this purpose. 'cex' is a scaling factor for both text size and symbols size, but which of those two plot elements it affects depends on when you pass it in--if you set it via 'par' then it acts on most of the text appearing on the plot; if you set it within the 'plot' function then it affects symbols size.
Sure, just use cex:
set.seed(42)
DF <- data.frame(x=1:10, y=rnorm(10)*10, z=runif(10)*3)
with(DF, plot(x, y, cex=z))
which gives you varying circle sizes. Color can simply be a fourth dimension.
