Let's consider this simple example of a HexagonLayer with a large number of points:
import numpy as np
import pandas as pd
import pydeck as pdk
import streamlit as st

lat0 = 40.7
lon0 = -74.12
n_points = 100000

# scatter n_points normally around (lat0, lon0)
lat = np.random.normal(loc=lat0, scale=0.02, size=n_points)
lon = np.random.normal(loc=lon0, scale=0.02, size=n_points)
data = pd.DataFrame({'lat': lat, 'lon': lon})

st.pydeck_chart(pdk.Deck(
    map_provider="mapbox",
    initial_view_state=pdk.ViewState(
        latitude=lat0,
        longitude=lon0,
        zoom=10,
    ),
    layers=[
        pdk.Layer(
            'HexagonLayer',
            data=data,
            get_position='[lon, lat]',
            radius=1000,
            coverage=0.6,
        ),
    ],
))
Here's the result:
As far as I understand, HexagonLayer is essentially a 2D histogram of the data. Here I have 100k points in the dataframe, but only on the order of 100 non-empty bins.
When moving or zooming the map, things are painfully slow. It looks as if the histogramming is re-run at every step (though with the same result), as opposed to other layer types like ScreenGridLayer or HeatmapLayer. Is there a way to cache the result of the hexagonal binning to speed things up? Or am I doing something wrong?
As a test, if I reduce the number of points, the map becomes far more usable (even if I greatly increase the number of non-empty bins).
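One possible workaround, sketched below under my own assumptions (none of this is from the original post): do the binning once in pandas and hand the pre-aggregated counts to a ColumnLayer, so deck.gl has no aggregation to redo on every interaction. The bin_size value is a hypothetical choice, and square lat/lon bins stand in for true hexagons here.

# Hedged sketch: pre-aggregate in pandas, render the counts with a ColumnLayer.
bin_size = 0.01  # degrees; hypothetical choice
binned = (
    data.assign(
        lat_bin=(data.lat // bin_size) * bin_size,
        lon_bin=(data.lon // bin_size) * bin_size,
    )
    .groupby(['lat_bin', 'lon_bin'], as_index=False)
    .size()  # adds a 'size' column with the count per bin
)

st.pydeck_chart(pdk.Deck(
    map_provider="mapbox",
    initial_view_state=pdk.ViewState(latitude=lat0, longitude=lon0, zoom=10),
    layers=[
        pdk.Layer(
            'ColumnLayer',
            data=binned,
            get_position='[lon_bin, lat_bin]',
            get_elevation='size',
            elevation_scale=5,
            radius=500,
        ),
    ],
))

In Streamlit, the aggregation step could additionally be wrapped in a cached function (e.g. with st.cache_data) so it only reruns when the input data changes.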
I want to plot a 3D surface plot. My data structure is the following: N files of two-column x,y data (with a 27-row information header), where x is always the same across all datasets and only y varies. I want the y from each dataset to be the z-axis in the 3D plot; x stays x, and the new y should be an equally spaced value (e.g. steps of 0.5) for each of the N datasets.
I tried first concatenating all files into one large data frame with indexed columns, which works, but I don't know how to proceed, or whether there is a smarter way to tackle this.
import pandas as pd
import numpy as np
import glob

dataname = "test"
path = "iq/"
all_files = sorted(glob.glob(path + "*.iq"))  # sorted for a deterministic dataset order

iqdata = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=27,
                     delim_whitespace=True, names=['Q', 'iq'])
    iqdata.append(df['iq'])  # keep only the y column of each file

frame = pd.concat(iqdata, axis=1, ignore_index=True)

# prepend the shared x column, taken from the first dataset
df_x = pd.read_csv(path + 'dataset_0001.iq', index_col=None, header=27,
                   delim_whitespace=True, names=['Q', 'iq'])
frame = pd.concat([df_x['Q'], frame], axis=1, ignore_index=True)
frame.to_csv(path + dataname + ".csv", index=False)
Any ideas on how to improve this and get a 3D surface (e.g. with matplotlib) out of it?
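A minimal sketch of the plotting step (my own, not from the post), assuming frame is laid out as above with one shared Q column followed by N iq columns, and that consecutive datasets should sit 0.5 apart on the y-axis, as the question describes:

import numpy as np
import matplotlib.pyplot as plt

q = frame.iloc[:, 0].to_numpy()   # shared x values (column 0)
z = frame.iloc[:, 1:].to_numpy()  # one iq column per dataset
# y positions: 0.5 per dataset, per the question's description
X, Y = np.meshgrid(q, 0.5 * np.arange(z.shape[1]))

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(X, Y, z.T, cmap='viridis')  # z.T matches the meshgrid shape
ax.set_xlabel('Q')
ax.set_ylabel('dataset')
ax.set_zlabel('iq')
plt.show()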
I would like to plot data which is defined only for each unordered pair of distinct elements in a set. This is naturally represented by a lower- or upper-triangular matrix, with no values on the diagonal. A correlation matrix is an example.
I might plot it like this:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

mat = np.random.randn(12, 12)
# Generate a mask for the upper triangle
mask = np.triu(np.ones_like(mat, dtype=bool))
sns.heatmap(mat, mask=mask, cmap="vlag", square=True, linewidths=.5, cbar=False)
plt.show()
giving
But I would prefer to plot it like this
Or perhaps it would be nicer to show each label just once along the hypotenuse/diagonal, rather than repeated on the two sides as I have done.
Is there a natural way to do this with pyplot or seaborn (or R)? I'm sure there's a relatively simple hacky way, but I wonder if there's a package out there that already does something like this. It seems like a natural way to represent symmetric relation data.
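I don't know of a packaged solution, but here is one hacky sketch of my own (assumptions: plain matplotlib, a 45° rotation of the cell grid via pcolormesh, and the coolwarm colormap standing in for seaborn's vlag):

import numpy as np
import matplotlib.pyplot as plt

mat = np.random.randn(12, 12)
n = mat.shape[0]

# corner coordinates of every cell, rotated by -45 degrees so the
# matrix diagonal becomes the horizontal base of a triangle
xs, ys = np.meshgrid(np.arange(n + 1), np.arange(n + 1))
theta = -np.pi / 4
xr = xs * np.cos(theta) - ys * np.sin(theta)
yr = xs * np.sin(theta) + ys * np.cos(theta)

# hide the diagonal and the upper triangle
masked = np.ma.masked_array(mat, mask=np.triu(np.ones_like(mat, dtype=bool)))

plt.pcolormesh(xr, yr, masked, cmap='coolwarm')
plt.gca().set_aspect('equal')
plt.axis('off')
plt.show()

Labels could then be placed once along the hypotenuse with plt.text at the rotated positions of the diagonal cells.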
I want to assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk).
I have some points on a map.
I can draw a simple 400 m buffer around them.
I want to determine which buffers overlap and then count the number of overlaps.
This number of overlaps should relate back to the original point, so I can see which point has the highest number of overlaps; if I were to walk 400 m from that point, I could then determine how many other points I could reach.
I've asked this question on GIS Stack Exchange, but I'm not sure it's going to get answered for ArcGIS, and I think I'd prefer to do the work in R.
This is what I'm aiming for
https://www.newham.gov.uk/Documents/Environment%20and%20planning/EB01.%20Evidence%20Base%20-%20Cumulative%20Impact%20V2.pdf
To simplify, here's some code:
# load packages
library(easypackages)
needed <- c("sf", "raster", "dplyr", "spData", "rgdal",
            "tmap", "leaflet", "mapview", "tmaptools", "wesanderson", "DataExplorer", "readxl",
            "sp", "rgisws", "viridis", "ggthemes", "scales", "tidyverse", "lubridate", "phecharts", "stringr")
easypackages::libraries(needed)

## read in csv data; first column is assumed to be Easting and second Northing
polls <- st_as_sf(read.csv(url("https://www.caerphilly.gov.uk/CaerphillyDocs/FOI/Datasets_polling_stations_csv.aspx")),
                  coords = c("Easting", "Northing"), crs = 27700)

polls_buffer_400 <- st_buffer(polls, 400)  # 400 m buffer around each polling station

polls_intersection <- st_intersection(x = polls_buffer_400, y = polls_buffer_400)
plot(polls_intersection$geometry)
That should show the overlapping buffers around the polling stations.
What I'd like to do is count the number of overlaps, which is done here:
polls_intersection_grouped <- polls_intersection %>%
  group_by(Ballot.Box.Polling.Station) %>%
  count()
And this is the bit I'm not sure about: to get to the output I want (which will show "hotspots" of polling stations in this case), how do I colour things? How can I:
assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk)?
It's probably terribly bad form, but here's my original GIS question:
https://gis.stackexchange.com/questions/328577/buffer-analysis-of-points-counting-intersects-of-resulting-polygons
Edit:
This gives the intersections different colours, which is great:
plot(polls_intersection$geometry, col = sf.colors(categorical = TRUE, alpha = .5))
summary(lengths(st_intersects(polls_intersection)))
What am I colouring here? I mean it looks nice but I really don't know what I'm doing.
How can I assess the degree of spatial proximity of each point to other equivalent points by looking at the number of others within 400m (5 minute walk)?
Here is how to add a column to your initial sfc of polling stations that tells you how many polling stations are within 400m of each feature in that sfc.
Note that the minimum value is 1 because a polling station is always within 400m of itself.
# n_neighbors shows how many polling stations are within 400m
polls %>%
  mutate(n_neighbors = lengths(st_is_within_distance(polls, dist = 400)))
Similarly, for your sfc collection of intersecting polygons, you could add a column that counts the number of buffer polygons that contain each intersection polygon:
polls_intersection %>%
  mutate(n_overlaps = lengths(st_within(geometry, polls_buffer_400)))
And this is the bit I'm not sure about: to get to the output I want (which will show "hotspots" of polling stations in this case), how do I colour things?
If you want to plot these things I highly recommend using ggplot2. It makes it very clear how you associate an attribute like colour with a specific variable.
For example, here the alpha (transparency) of each polygon is mapped to a scaled version of the n_overlaps column:
library(ggplot2)

polls_intersection %>%
  mutate(n_overlaps = lengths(st_covered_by(geometry, polls_buffer_400))) %>%
  ggplot() +
  geom_sf(aes(alpha = 0.2 * n_overlaps), fill = "red")
Lastly, there should be a better way to generate your intersecting polygons that already counts overlaps. This is built into the st_intersection function when finding the intersections of an sfc object with itself.
However, your data in particular generates an error when you try to do this:
st_intersection(polls_buffer_400)
#> Error in CPL_nary_intersection(x) :
#>   Evaluation error: TopologyException: side location conflict at 315321.69159061194 199694.6971799387.
I don't know what a "side location conflict" is. Maybe #edzer could help with that. However, most subsets of your data do not contain that conflict. For example:
# this version adds an n.overlaps column automatically:
st_intersection(polls_buffer_400[1:10, ]) %>%
  ggplot() +
  geom_sf(aes(alpha = 0.2 * n.overlaps), fill = "red")
I am looking for a specific online tool. At first it displays an empty 2D plot (with gridlines from -10 to 10, for example). You can also choose a color. When I select a color and then click on the plot, a new point should be drawn there. I can click multiple times so that multiple points are generated on the plot. Then I can change the color and generate more points on the same plot (but with a different color). When I'm done, I should be able to export the points as a list of coordinates and colors: [(0, 1, 'blue'), (1, 1, 'green'), (1, 2, 'green')].
Does anyone know of such a tool? Its purpose is simply to quickly generate a 2D dataset with multiple classes.
I wasn't able to find a tool that exactly meets all your requirements, but I think there is a solution that may fulfill some of them.
You can use plotly (https://plot.ly/create/) to visualize the points with its scatter plot creator.
As for the points themselves, you can generate them randomly and assign colors to them with a simple Python function like this:
import pandas as pd
import numpy as np
import random

def make_points(minv, maxv, total):
    # total uniformly distributed 2D points with coordinates in [minv, maxv]
    df = pd.DataFrame(np.random.uniform(low=minv, high=maxv, size=(total, 2)),
                      columns=list('XY'))
    # repeat the palette so it covers every row even when total isn't a multiple of 4
    arr = ["blue", "green", "purple", "red"]
    arr = (arr * (total // len(arr) + 1))[:total]
    random.shuffle(arr)
    df['color'] = arr
    df.to_csv("points.csv", index=False)
    return df

make_points(-10, 10, 100)
This, for example, creates a dataframe with 100 2D points whose coordinates range from -10 to 10, each randomly assigned one of four colors.
Import the CSV into the plotly chart creator and you can then edit the values manually if you like.
I tried to express the trajectory of a bullet when there is a drag force.
However, I am not able to express the graph precisely.
How can I depict the trajectory from the ODE?
This is my graph. It does not look plausible. Although I struggled with setting a different sign for the vydot value, it is not working correctly.
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import numpy as np

g = 10
m = 1
k = 0.01

y = np.zeros(2)
vy0 = 0
vydot = 200
vx0 = 0
vxdot = 200
y[0] = vy0
y[1] = vydot

x = np.zeros(2)
x[0] = vx0
x[1] = vxdot

t = np.linspace(0, 1000, 5000)

def fy(y, t):
    g0 = y[1]
    g1 = -k * y[1]
    return np.array([g0, g1])

def fx(z, t):
    g0 = -x[1]
    g1 = -k * (x[1]) - g
    return np.array([g0, g1])

ans1 = odeint(fy, y, t)
ans2 = odeint(fx, x, t)
ydata = ans1[:, 0]
xdata = ans2[:, 0]
plt.plot(ydata, xdata)
plt.show()
In air, as opposed to liquids, the bullet not only displaces the volume along its path, but also increases the momentum of the displaced air molecules in proportion to its velocity. Thus the drag force is quadratic in the speed:

vn = sqrt(vx² + vy²)
dragx = -k*vn*vx
dragy = -k*vn*vy
Thus use

def f(z, t):
    x, y, vx, vy = z
    vn = np.sqrt(vx * vx + vy * vy)
    return np.array([vx, vy, -k * vn * vx, -k * vn * vy - g])
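To make that concrete, here is a minimal end-to-end sketch of my own (the 40 s time span and the initial state are my assumptions, chosen to cover the full drag-free flight):

import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

g = 10.0
k = 0.01

def f(z, t):
    # state z = [x, y, vx, vy]; quadratic drag opposes the velocity vector
    x, y, vx, vy = z
    vn = np.sqrt(vx * vx + vy * vy)
    return [vx, vy, -k * vn * vx, -k * vn * vy - g]

z0 = [0.0, 0.0, 200.0, 200.0]   # start at the origin with 200 m/s in x and y
t = np.linspace(0, 40, 2000)
sol = odeint(f, z0, t)

plt.plot(sol[:, 0], sol[:, 1])  # trajectory: x against y
plt.xlabel('x [m]')
plt.ylabel('y [m]')
plt.ylim(bottom=0)              # ignore the part below ground level
plt.show()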
For a first overview, consider the problem without drag. Then the solution is

x(t) = vx*t = 200 m/s * t
y(t) = vy*t - g/2*t² = 200 m/s * t - 5 m/s² * t²

y(t) = 0 is met again at t = 2*vy/g, at the x coordinate 2*vx*vy/g = 8000 m. Maximum height is reached at t = vy/g, at height vy²/(2g) = 2000 m.
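As a quick numerical cross-check of those drag-free figures (my own addition):

vx = vy = 200.0          # m/s, initial velocity components
g = 10.0                 # m/s²

t_flight = 2 * vy / g    # 40.0 s, time until y(t) = 0 again
x_range = vx * t_flight  # 8000.0 m, horizontal range
h_max = vy**2 / (2 * g)  # 2000.0 m, maximum height
print(t_flight, x_range, h_max)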