umap for dictionary in Julia

I'm given a dictionary with keys(ids) and values.
> Dict{Int64, Vector{Float64}} with 122 entries:
3828 => [1, 2, 3, 4...
2672 => [6,7,5,8...
...
Now I need to apply UMAP to it. I have this code:
embedding = umap(mat, 2; n_neighbors=15, min_dist=0.001, n_epochs=200)
println(size(embedding))
Plots.scatter(embedding[1,:],embedding[2,:])
Here mat is the matrix
1, 2, 3, 4
6, 7, 5, 8
....
So I get the embedding matrix and the UMAP plot, but all points in the plot have the same color and there are no labels. How can I plot the points with labels (the keys of the dictionary)?

Looking at UMAP.jl, the input matrix should have the shape (n_features x n_samples). If each entry in your dictionary is a sample and I’m interpreting your matrix notation correctly, it appears you have this reversed.
You should be able to add the keys of the dictionary as annotations to the plot as follows (potentially with an optional additional offset to each coordinate):
Plots.annotate!(
embedding[1,:] .+ x_offset,
embedding[2,:] .+ y_offset,
string.(collect(keys(yourdict)))
)
Finally, I’m not sure what variable you actually want to map to the color of the markers in the scatterplot. If it’s the integer value of the keys you should pass this to the scatter function just like above except without turning them into strings.
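The shape and ordering points above can be illustrated with a small NumPy sketch. This is Python rather than Julia, and the dictionary contents (including the third id) are made up for illustration. The idea is to fix one key order, stack each value as a column so the matrix is (n_features x n_samples), and reuse that same order for the labels:

```python
import numpy as np

# Hypothetical stand-in for the dictionary in the question: id -> feature vector.
samples = {
    3828: [1.0, 2.0, 3.0, 4.0],
    2672: [6.0, 7.0, 5.0, 8.0],
    1517: [0.5, 1.5, 2.5, 3.5],
}

# Fix one iteration order and use it for both the columns and the labels.
ids = list(samples.keys())

# Each sample becomes one column: shape is (n_features, n_samples).
mat = np.column_stack([samples[k] for k in ids])

# Column i of mat (and hence point i of the embedding) belongs to labels[i].
labels = [str(k) for k in ids]
```

In Julia, the analogous step would be to collect the keys once and build both the matrix columns and the annotation strings from that same collection.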

Related

Line plot with color gradient

Is there a way to create a plot in IDL with a color gradient to it? What I'm looking for is similar to this Matlab question. The best I know how to do is to plot each segment of the line in a for loop, but this seems rather cumbersome:
x = float(indgen(11) - 5)
y = x ^ 2
loadct, 2, /silent
!p.background = 255
plot, x, y
for i = 0, 9 do begin
oplot, x(i:i+1), y(i:i+1), color = i * 20, thick = 4
endfor
I'm using IDL 8.2 if that makes a difference.
I had the same issue once and there seems to be no (simple) solution. Although I gave up, you can try using an RGB vector and the VERT_COLORS keyword provided by the PLOT function:
A vector of indices into the color table for the color of each vertex
(plot data point). Alternately, a 3xN byte array containing vertex
color values. If the values supplied are not of type byte, they are
scaled to the byte range using BYTSCL. If indices are supplied but no
colors are provided with the RGB_TABLE property, a default grayscale
ramp is used. If a 3xN array of colors is provided, the colors are
used directly and the color values provided with RGB_TABLE are
ignored. If the number of indices or colors specified is less than the
number of vertices, the colors are repeated cyclically.
That would make the color changes look more discrete, but maybe it will help you.
I have a routine MG_PLOTS which can do this in direct graphics:
IDL> plot, x, y, /nodata, color=0, background=255
IDL> mg_plots, x, y, color=indgen(10) * 20, thick=4
Of course, it is just a wrapper for what you were doing manually.

Move existing 2D points to newly generated 2D points in an optimized manner

I am trying to solve a problem where I have some existing points (P) that need to move to new locations generated by some method, say (P`). I want to know if there is an optimization algorithm that finds the best mapping of points.
I tried mapping each point to its nearest new point, choosing the best match in a loop, but the last points ended up with the worst deal. How can we determine the best mapping?
We are not aiming for the best time or space complexity, since we only have a handful of points to work with. The following is what we have so far.
getMapping <- function(originalX, originalY, newX, newY)
{
#Maps original index to new index
dimemsion <- length(originalX)
#this matrix will hold distance of each original point from each of the new points
dist.matrix <- matrix(nrow = dimemsion, ncol= dimemsion)
#this is a brute force method
for(i in 1:dimemsion) # i traverses over original data points
{
for(j in 1:dimemsion) # j traverses over new data points
{
distance <- sqrt((originalY[i] - newY[j])^2 + (originalX[i] - newX[j])^2)
dist.matrix[i,j] = distance
}
}
#Best way to find the mapping? Not sure how to do it right now
return(dist.matrix)
}
#Use Case 1
originalX = c( 1, 2, 3, 4, 5, 6)
originalY = c( 1, 2, 3, 4, 5, 6)
newX = c( 1, 1, 3, 4, 5, 6)
newY = c( 1, 1, 4, 3, 2, 1)
print(getMapping(originalX, originalY, newX , newY))
How can I find the best combination from the summationMatrix? Any algorithm or idea for approaching this issue will be appreciated. We are using R as the language here.
Thanks
First, you had better use the dist function to produce summationMatrix (the name summationMatrix is, imho, horrible; I would name it something like dist.matrix or dist.mat).
Second, what you need is called the Hungarian algorithm.
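As a rough sketch of what that looks like in practice, here is a Python version using SciPy (not R, so treat it as an illustration rather than a drop-in fix). `linear_sum_assignment` solves the same assignment problem the Hungarian algorithm addresses: it picks one new point per original point so that the total distance is minimal. The point data is taken from Use Case 1; the variable names are my own.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Points from Use Case 1, one (x, y) pair per row.
original = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6]], dtype=float)
new = np.array([[1, 1], [1, 1], [3, 4], [4, 3], [5, 2], [6, 1]], dtype=float)

# cost[i, j] is the Euclidean distance from original point i to new point j.
cost = cdist(original, new)

# Optimal one-to-one assignment minimizing the summed distance.
row_ind, col_ind = linear_sum_assignment(cost)
total = cost[row_ind, col_ind].sum()

for i, j in zip(row_ind, col_ind):
    print(f"original {i} -> new {j} (distance {cost[i, j]:.3f})")
print("total distance:", total)
```

Unlike the greedy nearest-point loop, this minimizes the sum over all pairs at once, so no point is left with whatever happens to remain.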

Maxima plot2d discrete with points

I have a samples list with a collection of (x,y) coordinates pairs. I want to use plot2d to create a discrete plot from these points, not showing lines connecting each point.
This is my script:
plot2d(
[discrete, samples],
[style, [points, 1, 5, 1]],
[legend, "Samples"],
[gnuplot_term, "svg size 640,480"],
[gnuplot_out_file, "graph_samples.svg"]
)$
But the result is a plot with connected points, as can be seen in the picture below. Despite the use of the [style, [points, 1, 5, 1]] option, the plot connects each point. The style definition seems to be ignored.
Does anyone have a clue why this is happening? I know I could alternatively use draw2d, but I'd rather use plot2d if possible.
You can also quote a symbol to prevent evaluation:
points: [1, 2, 3];
x: 42;
plot2d('x^2, ['x, 1, 2], ['style, 'points]);
The problem was that I had a points matrix previously declared that was conflicting with the style definition. Changed its name and worked like a charm.

Color Plot by order of points in list - Mathematica

I've got a list of three dimensional points, ordered by time. Is there a way to plot the points so that I can get a visual representation that also includes information on where in the list the point occurred? My initial thought is to find a way to color the points by the order in which they were plotted.
ListPlot3D drapes a sheet over the points, with no regard to the order which they were plotted.
ListPointPlot just shows the points, but gives no indication as to the order in which they were plotted. It's here that I am thinking of coloring the points according to the order in which they appear in the list.
ListLinePlot doesn't seem to have a 3D cousin, unlike a lot of the other plotting functions.
You could also do something like
lst = RandomReal[{0, 3}, {20, 3}];
Graphics3D[{Thickness[0.005],
Line[lst,
VertexColors ->
Table[ColorData["BlueGreenYellow"][i], {i,
Rescale[Range[Length[lst]]]}]]}]
As you did not provide examples, I made up some by creating a 3d self-avoiding random walk:
Clear[saRW3d]
saRW3d[steps_]:=
Module[{visited},
visited[_]=False;
NestList[
(Function[{randMove},
If[
visited[#+randMove]==False,
visited[#+randMove]=True;
#+randMove,
#
]
][RandomChoice[{{1,0,0},{-1,0,0},{0,1,0},{0,-1,0},{0,0,1},{0,0,-1}}]])&,
{0,0,0},
steps
]//DeleteDuplicates
]
(this is sort of buggy but does the job; it produces a random walk in 3d which avoids itself, ie, avoids revisiting the same place in subsequent steps).
Then we produce 100000 steps like this
dat = saRW3d[100000];
This is how I understood your data points to be. We then make the segments change color depending on which step they belong to:
datpairs = Partition[dat, 2, 1];
len = Length@datpairs;
dressPoints[pts_, lspec_] := {RGBColor[(N@First@lspec)/len, 0, 0],
Line@pts};
datplt = MapIndexed[dressPoints, datpairs];
This can also be done all at once like the other answers
datplt=MapIndexed[
{RGBColor[(N@First@#2)/Length@dat, 0, 0], Line@#1} &,
Partition[dat, 2, 1]
]
but I tend to avoid this sort of construction because I find it harder to read and modify.
Finally plot the result:
Graphics3D[datplt]
The path gets redder as time advances.
If this is the sort of thing you're after, I can elaborate.
EDIT: There might well be easier ways to do this...
EDIT2: Show a large set of points to demonstrate that this is very useful to see the qualitative trend in time in cases where arrows won't scale easily.
EDIT3: Added the one-liner version.
I think Heike's method is best, but she made it overly complex, IMHO. I would use:
Graphics3D[{
Thickness[0.005],
Line[lst,
VertexColors ->
ColorData["SolarColors"] /@ Rescale@Range@Length@lst ]
}]
(acl's data)
Graphics3D@(Arrow /@ Partition[RandomInteger[{0, 10}, {10, 3}], 2, 1])
As to your last question: If you want to have a kind of ListLinePlot3D instead of a ListPointPlot you could simply do the following:
pointList =
Table[{t, Sin[t] + 5 Sin[t/10], Cos[t] + 5 Cos[t/10],
t + Cos[t/10]}, {t, 0, 100, .5}];
ListPointPlot3D[pointList[[All, {2, 3, 4}]]] /. Point -> Line
Of course, in this way you can't set line properties so you have to change the rule a bit if you want that:
ListPointPlot3D[pointList[[All, {2, 3, 4}]]] /.
Point[a___] :> {Red, Thickness[0.02], Line[a]}
or with
ListPointPlot3D[pointList[[All, {2, 3, 4}]]] /.
Point[a___] :> {Red, Thickness[0.002], Line[a], Black, Point[a]}
But then, why don't you use just Graphics3D and a few graphics primitives?

Scipy - data interpolation from one irregular grid to another irregular spaced grid

I am struggling with the interpolation between two grids, and I couldn't find an appropriate solution for my problem.
I have 2 different 2D grids, whose node points are defined by their X and Y coordinates. The grid itself is not rectangular, but forms more or less a parallelogram (so the X coordinate of (i,j) is not the same as that of (i,j+1), and the Y coordinate of (i,j) is different from the Y coordinate of (i+1,j)).
Both grids have a 37*5 shape and they overlap almost entirely.
For the first grid I have, for each point, the X coordinate, the Y coordinate and a pressure value. Now I would like to interpolate this pressure distribution from the first grid onto the second grid (for which X and Y are also known at each point).
I tried different interpolation methods, but my end result was never correct due to the irregular distribution of my grid points.
Functions such as interp2d or griddata require 1D arrays as input, but if I do this, the interpolated solution is wrong (even if I interpolate the pressure values from the original grid back onto the original grid, the new pressure values are miles away from the original values).
For 1D interpolation on different irregular grids I use:
import numpy

def interpolate(X, Y, xNew):
    # Outside [X[0], X[-1]]: warn and clamp to the nearest end value
    if xNew < X[0]:
        print('Interp Warning :', xNew, 'is under the interval [', X[0], ',', X[-1], ']')
        yNew = Y[0]
    elif xNew > X[-1]:
        print('Interp Warning :', xNew, 'is above the interval [', X[0], ',', X[-1], ']')
        yNew = Y[-1]
    elif xNew == X[-1]:
        yNew = Y[-1]
    else:
        # Find the interval with X[ind] <= xNew < X[ind+1], then interpolate linearly
        ind = numpy.argmax(numpy.bitwise_and(X[:-1] <= xNew, X[1:] > xNew))
        yNew = Y[ind] + ((xNew - X[ind]) / (X[ind+1] - X[ind])) * (Y[ind+1] - Y[ind])
    return yNew
but for 2D I thought griddata would be easier to use. Does anyone have experience with an interpolation where my input is a 2D array for the mesh and for the data?
Have another look at interp2d. http://docs.scipy.org/scipy/docs/scipy.interpolate.interpolate.interp2d/#scipy-interpolate-interp2d
Note the second example in the 'x,y' section under 'Parameters'. 'x' and 'y' are 1-D in a loose sense but they can be flattened arrays.
Should be something like this:
import scipy.interpolate
f = scipy.interpolate.interp2d([0.25, 0.5, 0.27, 0.58], [0.4, 0.8, 0.42,0.83], [3, 4, 5, 6])
znew = f(.25,.4)
print(znew)
[ 3.]
znew = f(.26,.41) # midway between (0.25,0.4,3) and (0.27,0.42,5)
print(znew)
[ 4.01945345] # Should be 4 - close enough?
I would have thought you could pass flattened 'xnew' and 'ynew' arrays to 'f()', but I couldn't get that to work. The 'f()' function does accept the row, column syntax, though that isn't useful to you. Because of this limitation of 'f()', you will have to evaluate 'znew' inside a loop; you might look at nditer for that. Also make sure it does what you want when '(xnew, ynew)' is outside the '(x, y)' domain.
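For comparison, here is a minimal sketch of the griddata route mentioned in the question, which avoids the loop. The grids below are made up (two 37x5 parallelogram-shaped grids and a linear pressure field chosen so the result is easy to check); only `scipy.interpolate.griddata` itself is the real API. The source nodes are flattened into an (N, 2) array of points, and the target coordinates are passed as 2D arrays so the result keeps the grid shape. Target nodes outside the convex hull of the source grid come back as NaN with the linear method, so this sketch fills them from the nearest source node.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical source grid: 37x5 nodes forming a parallelogram.
i, j = np.meshgrid(np.arange(37), np.arange(5), indexing="ij")
x1 = i + 0.3 * j
y1 = 0.1 * i + j
p1 = 2.0 * x1 + 3.0 * y1 + 1.0  # made-up linear "pressure" field

# Hypothetical target grid: same shape, slightly shifted, mostly overlapping.
x2 = x1 + 0.2
y2 = y1 + 0.1

# griddata takes scattered points, so flatten the source grid to (N, 2).
points = np.column_stack([x1.ravel(), y1.ravel()])
values = p1.ravel()

# Passing the 2D target coordinates as a tuple keeps the (37, 5) shape.
p2 = griddata(points, values, (x2, y2), method="linear")

# Nodes outside the source grid's convex hull are NaN; fill them by nearest node.
outside = np.isnan(p2)
if outside.any():
    p2[outside] = griddata(points, values, (x2[outside], y2[outside]), method="nearest")
```

If the X and Y coordinates differ greatly in scale, the `rescale=True` option of griddata may be worth trying, since the underlying triangulation is sensitive to that.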
