Plot along different dimensions - julia

I have the following basic code. The first line sums p along dimension 1 to create a 1 × N array (a single row). The next line plots A. Unfortunately, it seems that Julia assumes it must plot many separate series (in this case just single points), one per column along dimension 2.
A = sum(p,dims = 1)
plot(A)
So, my question is: how can I plot a simple line when the data is in a 1 × N array?

I assume you use Plots.jl. The following is from Plots.jl's documentation.
If the argument [to plot] is a "matrix-type", then each column will map to a series, cycling through columns if there are fewer columns than series. In this sense, a vector is treated just like an "nx1 matrix".
The number of series plot(a) tries to plot is the number of columns in a.
To get a single series, you can do one of the following:
plot(vec(a)) # `vec` will give you a vector view of `a` without an allocation
plot(a') # or `plot(transpose(a))`. `transpose` does not allocate a new array
plot(a[:]) # this allocates a new array so you should probably avoid it

Related

plotting a series of coordinates stored in a 2D array

So let's say I define the following array in Julia:
M=[[1,1],[2,4],[3,9],[4,16],[5,25],[6,36],[7,49],[8,64],[9,81],[10,100],[11,121],[12,144]]
Clearly each element [x,y] follows the quadratic rule $y=x^2$ and so I expect to get a parabolic shape when I plot it by using the command plot(M).
But instead I'm getting something like this:
[plot omitted: each inner vector is drawn as its own short line rather than a single parabola]
What am I doing wrong, and what should I do to get my desired result -- a parabolic shape?
From the docs for Plots.jl:
The plot function has several methods:
plot(y): treats the input as values for the y-axis and yields a unit-range as x-values.
i.e. when you pass a single argument to plot, the values in the argument get interpreted as y-axis values, with the x-axis being 1, 2, 3, ....
Here, because M is a vector of vectors, a line plot is created for each of the inner vectors. For example, [3, 9] results in a line plot from (1, 3) to (2, 9).
To plot the parabola, in this case, you can do:
plot(first.(M), last.(M))
which extracts the first element of each inner vector to form the x-axis values, and the last (here, the second) element for the y-axis values.
Of course, it's better to just create them as separate vectors in the first place, if you don't require M to be a vector of vectors for some other reason.
If M is instead changed into a Matrix (which is the recommended way to create 2D arrays in Julia), e.g.
julia> M
12×2 Matrix{Int64}:
1 1
2 4
3 9
etc.
then you can plot it with
julia> @views plot(M[:, 1], M[:, 2])
M[:, 1] gets all the values in the first column (the x-axis), M[:, 2] does the same for the second column (the y-axis), and the @views at the beginning avoids allocating new arrays for these slices unnecessarily; the values are read and used directly from M itself.
Interestingly, since Plots handles an array of Tuples as an array of (x, y) points, this works:
plot(Tuple.(M))

filtering data within a correlated matrix

I have a data.frame; I compare the X and Y axes and get N results, and I need to generate a graphic (ggcorplot or correlationplot), but I want to apply a filter to the chart so that only values above 0 are included in the graph.
I have already tried:
dataCorrelation[dataCorrelation > 0] <- ""
but the graph doesn't accept an empty value, and I can't put in a fake value.
I hope to be able to generate the graph without the values that are less than 0.
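The thread does not include an answer, but here is a minimal sketch of one way to do the filtering, assuming the corrplot package is acceptable; the idea is to replace the unwanted values with NA (which keeps the matrix numeric) rather than with an empty string:
# Sketch only (not from the thread). Assumes dataCorrelation is a numeric
# correlation matrix and that the corrplot package may be used.
library(corrplot)
dataCorrelation <- cor(mtcars[, 1:5])   # stand-in example data

filtered <- dataCorrelation
filtered[filtered <= 0] <- NA           # drop values that are not above 0

# corrplot can render NA cells with a custom label; a blank label hides them.
corrplot(filtered, na.label = " ")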

Averaging different length vectors with same domain range in R

I have a dataset that looks like the one shown in the code.
What I am guaranteed is that the "(var)x" (domain) of each variable is always between 0 and 1. The "(var)y" (co-domain) can vary, but it is also bounded, within a larger range.
I am trying to get an average over the "(var)x" domain, but across the different variables.
I would like some kind of selective averaging; I am not sure how to do this in R.
ax=c(0.11,0.22,0.33,0.44,0.55,0.68,0.89)
ay=c(0.2,0.4,0.5,0.42,0.5,0.43,0.6)
bx=c(0.14,0.23,0.46,0.51,0.78,0.91)
by=c(0.1,0.2,0.52,0.46,0.4,0.41)
qx=c(0.12,0.27,0.36,0.48,0.51,0.76,0.79,0.97)
qy=c(0.03,0.2,0.52,0.4,0.45,0.48,0.61,0.9)
a<-list(ax,ay)
b<-list(bx,by)
q<-list(qx,qy)
What I would like is something like
avgd_x = c(0.12,0.27,0.36,0.48,0.51,0.76,0.79,0.97)
and
avgd_y would have contents that, for each of those x-values (e.g. 0.12), find the value of ay and by at that point and take the mean of those together with the corresponding qy value.
Similarly, and so forth, for all the values in the vector with the largest number of elements.
How can I do this in R ?
P.S.: This is a toy dataset; my real dataset is spread over files and I am reading them with a custom function, but the raw data is available as shown in the code above.
Edit:
Some clarification:
avgd_y would have the length of the largest vector. For example, in the case above, avgd_y would be (ay' + by' + qy)/3, where ay' and by' are vectors with entries ay(qx[i]) and by(qx[i]) for i from 1 to length(qx); that is, ay' and by' would hold the values of ay and by interpolated at the data points of qx.
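No answer was posted in the thread; below is a minimal sketch of one way to do this with base R's approx(), assuming linear interpolation onto the x-grid of the longest series is acceptable:
# Sketch only (not from the thread). Uses ax, ay, bx, by, qx, qy as defined above.
series <- list(a = list(x = ax, y = ay),
               b = list(x = bx, y = by),
               q = list(x = qx, y = qy))

# Take the x-vector with the most points as the common grid.
avgd_x <- series[[which.max(sapply(series, function(s) length(s$x)))]]$x

# Interpolate every series at avgd_x; rule = 2 extends the end values so that
# points outside a series' own x-range do not become NA (an assumption).
interp_y <- sapply(series, function(s) approx(s$x, s$y, xout = avgd_x, rule = 2)$y)

avgd_y <- rowMeans(interp_y)   # (ay' + by' + qy) / 3 at each point of avgd_x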

How to use pointDistance with a very large vector

I've got a big problem.
I've got a large raster (rows=180, columns=480, number of cells=86400)
At first I binarized it (so that there are only 1's and 0's) and then I labelled the clusters. (Cells that are 1 and connected to each other got the same label.)
Now I need to calculate all the distances between the cells that are NOT 0.
There are quite a lot, and that's my big problem.
I did this to get the coordinates of the cells I'm interested in (i.e. the positions, as cell numbers, of the cells that are not 0):
V = getValues(label)          # all cell values of the labelled raster
Vu = c(1:max(V))              # the possible cluster labels, 1 .. max(V)
pos = which(V %in% Vu)        # cell numbers whose value is a label (i.e. not 0)
XY = xyFromCell(label, pos)   # x/y coordinates of those cells
This works very well. So XY is a matrix, which contains all the coordinates (of cells that are not 0). But now I'm struggling. I need to calculate the distances between ALL of these coordinates. Then I have to put each one of them in one of 43 bins of distances. It's kind of like this (just an example):
0 < x < 0.2: bin 1
0.2 < x < 0.4: bin 2
When I use this:
pD=pointDistance(XY,lonlat=FALSE)
R says it's not possible to allocate vector of this size. It's getting too large.
Then I thought I could do this (create an empty data frame df, or something like that, and let the function pointDistance run over every single row of XY):
for (i in 1:nrow(XY)) {
  pD = pointDistance(XY, XY[i, ], lonlat = FALSE)
  pDbin = as.matrix(table(cut(pD, breaks = seq(0, 8.6, by = 0.2), labels = 1:43)))
  df = cbind(df, pDbin)
  df = apply(df, 1, FUN = function(x) sum(x))
}
It works when I try this with, e.g., the first 50 rows of XY.
But when I use it for the whole XY matrix it takes too much time. (Sometimes this XY matrix contains 10000 xy-coordinates.)
Does anyone have an idea how to do it faster?
I don't know whether this will work fast or not, but I recommend you try this:
Let's say you have a data frame with a value of 0 or 1 in each cell. To find the coordinates, all you have to do is write the code below:
cord_matrix <- which(dataframe == 1, arr.ind = TRUE)
Now you have the coordinate matrix, with the row index and column index of each 1-cell.
To find the Euclidean distances, use the dist() function (have a look at its documentation). It will look like this:
dist_vector <- dist(cord_matrix)
It returns the lower triangle of the distance matrix as a "dist" object, which can be transformed into a vector or a full symmetric matrix. Now all you have to do is compute the bins according to your requirements.
Let me know if this works within your memory limits.
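The memory problem in the question comes from materialising all pairwise distances at once. Below is a minimal sketch (not from the thread) that accumulates the 43 bin counts row by row instead, assuming XY is the coordinate matrix built in the question:
# Sketch only: count distances per bin without storing the full distance matrix.
breaks <- seq(0, 8.6, by = 0.2)            # 43 bins of width 0.2, as in the question
bin_counts <- integer(length(breaks) - 1)

for (i in seq_len(nrow(XY) - 1)) {
  # Distances from point i to all later points, so each pair is counted once.
  dx <- XY[(i + 1):nrow(XY), 1] - XY[i, 1]
  dy <- XY[(i + 1):nrow(XY), 2] - XY[i, 2]
  d  <- sqrt(dx^2 + dy^2)
  # cut() returns the bin index; distances outside the breaks become NA and are
  # silently ignored by tabulate(), matching the behaviour of the original code.
  bin_counts <- bin_counts + tabulate(cut(d, breaks = breaks, labels = FALSE),
                                      nbins = length(breaks) - 1)
}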

Drawing a Square Line Chart using quantmod

Is there a way to get quantmod to draw a square line chart?
I've tried modifying my time series so that each data point is replicated one second before the next data point (hoping this would approximate a square line), but quantmod seems to space the data on the x-axis sequentially and evenly without regard to the actual values of x (i.e. the horizontal space between one point and the next is the same whether the delta-T is 1 second or 1 minute).
I suppose I could convert my timeseries from a sparse to a dense one (one entry per second instead of one entry per change in value), but this seems very kludgy and should be unnecessary.
I'm constructing my time series thus:
library(quantmod)
myNumericVector <- c(3,7,2,9,4)
myDateTimeStrings <- paste("2011-10-31", c("5:26:00", "5:26:10", "5:26:40", "5:26:50", "5:27:00"))
myXts <- xts(myNumericVector, order.by=as.POSIXct(myDateTimeStrings))
And drawing the chart like so:
chartSeries(myXts, type="line", show.grid="true", theme=chartTheme("black"))
To illustrate what I have vs. what I want, the result looks like the blue line below but I'd like something more like the green:
Also, for the curious, here is the code that replicates points in the time series so that the gap between one value and the next is as small as possible:
mySquareDateTimes <- rep(as.POSIXct(myDateTimeStrings),2)[-1]
mySquareDateTimes[seq(2,8,by=2)] <- mySquareDateTimes[seq(2,8,by=2)] - 1
mySquareXts <- xts(rep(myNumericVector,each=2)[-10], order.by=mySquareDateTimes)
chartSeries(mySquareXts, type="line", show.grid="true", theme=chartTheme("black"))
The results are less than ideal.
You want a line.type of "step":
chartSeries(myXts, line.type="s")
See ?plot, specifically "type" under ... in the Arguments section (you may want "S" instead of "s").
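For completeness, a sketch combining the step line type with the other options from the question (assuming chartSeries simply passes line.type through to the underlying plot call, as the answer suggests):
# Sketch only: the asker's original call plus the step line type from the answer.
chartSeries(myXts, type = "line", line.type = "s",
            show.grid = TRUE, theme = chartTheme("black"))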
