I want to create a 3D surface plot. My data consist of N files of two-column x,y data (each with a 27-row information header), where x is identical across all datasets and only y varies. I want the y values from each dataset to be the z-axis in the 3D plot. x would stay x, and y should be an equally spaced offset (e.g. 0.5) for each of the N datasets.
I first tried to concatenate all files into one large data frame with indexed columns, which works, but I don't know how to proceed or whether there is a smarter way to tackle this.
import pandas as pd
import numpy as np
import glob
dataname = "test"
path = "iq/"
all_files = glob.glob("iq/*.iq")
iqdata = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=27, delim_whitespace=True, names=['Q', 'iq'])
    iq = df['iq']
    iqdata.append(iq)
frame = pd.concat(iqdata, axis=1, ignore_index=True)
df_x = pd.read_csv('iq/dataset_0001.iq', index_col=None, header=27, delim_whitespace=True, names=['Q', 'iq'])
frame = pd.concat([df_x, frame], axis=1, ignore_index=True)
frame.to_csv(path + dataname + ".csv", index=False)
Any ideas on how to improve this and get a 3D surface (e.g. with matplotlib) out of it?
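One possible way to get from N curves on a shared x axis to a surface is to stack the curves into a 2D array and build matching X and Y grids with `np.meshgrid`. The sketch below is only an illustration under assumptions: the helper names `build_surface_grid` and `plot_surface` are made up, and synthetic curves stand in for the `.iq` files (in practice each file's `iq` column would be one entry of `curves`).

```python
import numpy as np

def build_surface_grid(x, curves, y_step=0.5):
    """Stack N curves sharing the same x into X, Y, Z grids for a surface plot."""
    x = np.asarray(x, dtype=float)
    # One row per dataset: Z has shape (N, len(x))
    Z = np.vstack([np.asarray(c, dtype=float) for c in curves])
    # Equally spaced offset for each dataset: 0, y_step, 2*y_step, ...
    y = np.arange(Z.shape[0]) * y_step
    # X and Y both get shape (N, len(x)), matching Z
    X, Y = np.meshgrid(x, y)
    return X, Y, Z

def plot_surface(X, Y, Z):
    # matplotlib is imported here so the grid helper works without it
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(X, Y, Z, cmap="viridis")
    ax.set_xlabel("Q")
    ax.set_ylabel("dataset offset")
    ax.set_zlabel("iq")
    plt.show()

# Synthetic stand-in for the .iq files: 4 curves on a shared x axis
x = np.linspace(0.1, 1.0, 50)
curves = [np.exp(-k * x) for k in range(1, 5)]
X, Y, Z = build_surface_grid(x, curves)
```

Calling `plot_surface(X, Y, Z)` afterwards would draw the surface. Note that the file order decides the row order in `Z`, so sorting the file list first is probably wise.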
Let's consider this simple example of HexagonLayer with a large number of points:
import numpy as np
import pandas as pd
import pydeck as pdk
import streamlit as st
lat0=40.7
lon0=-74.12
n_points = 100000
lat = np.random.normal(loc=lat0, scale=0.02, size=n_points)
lon = np.random.normal(loc=lon0, scale=0.02, size=n_points)
data = pd.DataFrame({'lat': lat, 'lon': lon})
st.pydeck_chart(pdk.Deck(
    map_provider="mapbox",
    initial_view_state=pdk.ViewState(
        latitude=lat0,
        longitude=lon0,
        zoom=10,
    ),
    layers=[
        pdk.Layer(
            'HexagonLayer',
            data=data,
            get_position='[lon, lat]',
            radius=1000,
            coverage=0.6,
        ),
    ],
))
Here's the result:
AFAIU, HexagonLayer is a histogram of the data. Here, I have 100k points in the dataframe, but only O(100) non-empty bins.
When moving or zooming the map, things are painfully slow. It looks as if the histogramming is rerun at every step (though with the same result, as opposed to other layer types like ScreenGridLayer or HeatmapLayer). Is there a way to cache the result of the hexagonal binning to speed things up? Or am I doing something wrong?
As a test, if I reduce the number of points, the map becomes much more usable (even if I increase the number of non-empty bins a lot).
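To make the "many points, few bins" observation concrete, here is a rough sketch that bins the same kind of data once, outside the map layer. It uses a square grid as a stand-in for hexagonal binning, and the helper `bin_points` and its `cell_deg` parameter are made up for illustration. Whether feeding such a pre-binned table to the map actually speeds up rendering is an untested assumption.

```python
import random
from collections import Counter

def bin_points(lats, lons, cell_deg=0.01):
    """Count points per square grid cell (a stand-in for hexagonal binning)."""
    counts = Counter()
    for lat, lon in zip(lats, lons):
        # Floor-divide the coordinates to get an integer cell index
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell] += 1
    # One row per non-empty cell: cell-centre coordinates plus the count
    return [
        {"lat": (i + 0.5) * cell_deg, "lon": (j + 0.5) * cell_deg, "count": n}
        for (i, j), n in counts.items()
    ]

random.seed(0)
n_points = 100_000
lats = [random.gauss(40.7, 0.02) for _ in range(n_points)]
lons = [random.gauss(-74.12, 0.02) for _ in range(n_points)]
binned = bin_points(lats, lons)
print(len(binned), "non-empty cells for", n_points, "points")
```

The binned table has one row per non-empty cell, so it is far smaller than the raw point list.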
I want to find the path that connects many points in 2D space (actually latitude and longitude coordinates). These points are measured from a train (roughly every 10 seconds).
I found a method to "denoise" the points and reduce the total number of points. Here is an example of how the data looks before I denoise it.
The data points are not ordered along the path. What I would like to do is sort the points along the path so that I can iterate over them from start to finish.
I'm somewhat new to R. I have written a method to sort the points in C and used Rcpp to integrate it into R. But I would like to know how I can do this in R itself. I don't want to iterate over the points in R in a for loop, as that will be too slow. I need something like sapply, which does the looping internally in compiled code.
Here is an example of the kind of data I have after I denoise (this data is not connected to the plot above).
0.000000 0.000000
0.999886 0.015104
1.994528 -0.088276
2.975603 -0.281902
3.945894 -0.523844
4.906713 -0.801021
5.893859 -0.960844
6.864580 -1.201053
7.859816 -1.298548
8.856026 -1.211567
9.851185 -1.113287
10.851147 -1.121947
11.844307 -1.238707
12.800410 -1.531737
13.741038 -1.871177
14.663443 -2.257401
15.641304 -2.466656
16.641061 -2.488718
17.638100 -2.565617
18.633595 -2.660429
19.630684 -2.584182
20.618181 -2.426543
21.595680 -2.215604
22.565897 -1.973365
23.554708 -1.824193
24.508381 -1.523349
25.412466 -1.095996
26.322757 -0.682028
27.216991 -0.234427
28.130066 0.173365
In this case it is already sorted. But assume these rows were randomly ordered. How can I recover the path in order?
Make the plot like this
path <- read.table("data.txt")
plot(path)
lines(path)
path <- read.table(text = "0.000000 0.000000
0.999886 0.015104
1.994528 -0.088276
2.975603 -0.281902
3.945894 -0.523844
4.906713 -0.801021
5.893859 -0.960844
6.864580 -1.201053
7.859816 -1.298548
8.856026 -1.211567
9.851185 -1.113287
10.851147 -1.121947
11.844307 -1.238707
12.800410 -1.531737
13.741038 -1.871177
14.663443 -2.257401
15.641304 -2.466656
16.641061 -2.488718
17.638100 -2.565617
18.633595 -2.660429
19.630684 -2.584182
20.618181 -2.426543
21.595680 -2.215604
22.565897 -1.973365
23.554708 -1.824193
24.508381 -1.523349
25.412466 -1.095996
26.322757 -0.682028
27.216991 -0.234427
28.130066 0.173365")
names(path) <- c("x", "y")
## Randomize points
path <- path[sample(1:nrow(path)),]
## Function to calculate distances
my.dist <- function(p1 = c(x,y), p2 = c(0,0)) sqrt((p1[1]-p2[1])^2 + (p1[2] - p2[2])^2)
dists.to.origin <- apply(path, 1, my.dist)
## Order data frame by distances.
path <- path[order(dists.to.origin),]
plot(path)
lines(path)
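Ordering by distance to the origin works for this sample because the path moves steadily away from (0, 0); it would not recover a path that curves back towards its start. A different technique, greedy nearest-neighbour chaining, is sketched below in Python. This is not the method used above, the function name `order_along_path` is made up, and it assumes the start point is known and that consecutive points are closer to each other than to any other part of the path.

```python
import math
import random

def order_along_path(points, start=0):
    """Greedy nearest-neighbour ordering starting from points[start]."""
    remaining = list(points)
    ordered = [remaining.pop(start)]
    while remaining:
        last = ordered[-1]
        # Index of the unvisited point closest to the last ordered point
        k = min(range(len(remaining)),
                key=lambda i: math.dist(last, remaining[i]))
        ordered.append(remaining.pop(k))
    return ordered

# First five rows of the sample data, then shuffled
path = [(0.0, 0.0), (0.999886, 0.015104), (1.994528, -0.088276),
        (2.975603, -0.281902), (3.945894, -0.523844)]
shuffled = path[:]
random.seed(1)
random.shuffle(shuffled)
start = shuffled.index((0.0, 0.0))  # assumes the start point is known
recovered = order_along_path(shuffled, start)
```

The loop is quadratic in the number of points, so for large inputs a spatial index would probably be needed.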
I'm working from a CSV of over a million rows of data taken from a rectangular sample. The CSV contains 3 columns: x coordinate (in steps of .06 or .07), y coordinate (in steps of .16 or .17), and the reading at that point. I'm looking for some way to visualize the sample in R, creating a pseudo-image of the sample based on the readings using a color gradient.
Searching online, this solution seemed promising: Creating a matrix by color in R. But I'm running into issues creating a matrix from my data:
x = c(unique(CTLI$xcoord)) #getting all of the different x values
y = c(unique(CTLI$ycoord)) #getting all the y values
matVar = matrix(CTLI$CTLI, nrow = x, ncol = y)
And am getting the error
Warning message:
In matrix(CTLI$CTLI, nrow = x, ncol = y) :
data length exceeds size of matrix
I'm not committed to this solution though, so any other ideas would be much appreciated. Thank you!
I played with the answer you linked to, learned a bit from it, and ended up with this:
library(reshape2)  # melt()
library(ggplot2)   # ggplot(), geom_raster()
CTLI <- read.table(text="xcoord,ycoord,measure
12,15,30
16,20,25
19,35,38",header=TRUE,sep=",")
mCTLI <- melt(CTLI,id.vars=c("xcoord","ycoord"))
iCTLI <- ggplot(mCTLI,aes(x=xcoord,y=ycoord,fill=value)) + geom_raster()
iCTLI
You don't have to build a matrix first; melt creates a data.frame suitable for ggplot.
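If a matrix is still wanted, its dimensions have to be the numbers of distinct x and y values rather than the coordinate vectors themselves. A minimal Python sketch of that pivot, using the three sample rows above, is shown here. The helper `to_grid` is hypothetical, and cells without a reading are simply left as `None`.

```python
def to_grid(rows):
    """Pivot (x, y, value) rows into a 2D list indexed as grid[y_index][x_index]."""
    xs = sorted({x for x, _, _ in rows})
    ys = sorted({y for _, y, _ in rows})
    x_index = {x: i for i, x in enumerate(xs)}
    y_index = {y: j for j, y in enumerate(ys)}
    # Dimensions are the COUNTS of unique coordinates; missing cells stay None
    grid = [[None] * len(xs) for _ in ys]
    for x, y, value in rows:
        grid[y_index[y]][x_index[x]] = value
    return xs, ys, grid

rows = [(12, 15, 30), (16, 20, 25), (19, 35, 38)]
xs, ys, grid = to_grid(rows)
```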
I have a specific problem. First of all, I am using Octave. I have a dataset where every row has the following format:
datarow = [ x, y, z, colourIndex];
The length of the dataset is irrelevant, but suppose it is 10. I want to be able to plot the 3d plot with every point having a colour of its specific color index. Of course I know that I can use a for loop and add every point individually, but I find it hard to believe that there isn't already some way to do that using vectors.
So far I have tried:
map = cool(); #init colormap
data = initializeData(); #initialize data
plot3(data(:,1),data(:,2),data(:,3),"c" , map(data(:,4))); #doesn't work
Any ideas if it's possible to do a one-liner for my issue?
Use scatter3:
N_colors = 64;
colormap(cool(N_colors));
# point positions (your data(:, 1:3))
[x, y, z] = peaks (20);
# these are the color indexes in the colormap (your data(:, 4))
c_index = fix(rand(size(x)) * N_colors);
marker_size = 8;
scatter3(x(:), y(:), z(:), marker_size, c_index(:))
I have a SpatialPointsDataFrame loaded with
pst<-readOGR("/data_spatial/coast/","points_coast")
I would like to get a SpatialLines object as output. I have found something:
coord<-as.data.frame(coordinates(pst))
Slo1<-Line(coord)
Sli1<-Lines(list(Slo1),ID="coastLine")
coastline <- SpatialLines(list(Sli1))
class(coastline)
It seems to work, but when I try plot(coastline), I get a line that should not be there ...
Can someone help me? The shapefile is here!
I have looked at the shapefile. There is an id column, but if you plot the data, it seems that the id is not ordered north-south or something. The extra lines are created because the point order is not perfect, connecting points that are next to each other in the table, but far from each other in terms of space. You could try to figure out the correct ordering of the data by calculating distances between points and then ordering on distance.
A workaround is to remove those lines that are longer than a certain distance, e.g. 500 m. First, find out where the distance between consecutive coordinates is larger than this distance: the breaks. Then take the subset of coordinates between two breaks, and lastly create Lines for that subset. You end up with a coastline consisting of several (breaks-1) segments and without the erroneous ones.
# read data
library(rgdal)
pst<-readOGR("/data_spatial/coast/","points_coast")
coord<-as.data.frame(coordinates(pst))
colnames(coord) <- c('X','Y')
# determine distance between consecutive coordinates
linelength = LineLength(as.matrix(coord),sum=F)
# 'id' of long lines, plus 0 (so the first segment starts at row 1) and the last row
breaks = c(0,which(linelength>500),nrow(coord))
# check position of breaks
print(breaks)
# plot extent of coords and check breaks
plot(coord,type='n')
points(coord[breaks,], pch=16,cex=1)
# create vector to be filled with lines of each subset
ll <- vector("list", length(breaks)-1)
for (i in 1:(length(breaks)-1)){
  subcoord = coord[(breaks[i]+1):(breaks[i+1]),]
  # check if subset contains at least 2 coordinates
  if (nrow(subcoord) >= 2){
    Slo1<-Line(subcoord)
    Sli1<-Lines(list(Slo1),ID=paste0('section',i))
    ll[[i]] = Sli1
  }
}
# remove any invalid (NULL) entries; logical indexing also works when there are none
ll = ll[!vapply(ll, is.null, logical(1))]
lin = SpatialLines(ll)
# add result to plot
lines(lin,col=2)
# write shapefile
df = data.frame(row.names=names(lin),id=1:length(names(lin)))
lin2 = SpatialLinesDataFrame(sl=lin, data=df)
proj4string(lin2) <- proj4string(pst)
writeOGR(obj=lin2, layer='coastline', dsn='/data_spatial/coast', driver='ESRI Shapefile')
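The splitting step on its own, independent of sp and rgdal, can be sketched in a few lines of Python. This is only an illustration of the idea: the function name `split_at_gaps` and the sample coordinates are made up, and the input is assumed to be a non-empty, ordered list of planar (x, y) points in metres.

```python
import math

def split_at_gaps(coords, max_gap=500.0):
    """Split an ordered list of (x, y) points wherever two consecutive
    points are farther apart than max_gap."""
    segments, current = [], [coords[0]]
    for prev, cur in zip(coords, coords[1:]):
        if math.dist(prev, cur) > max_gap:
            # Gap too long: close the current segment and start a new one
            segments.append(current)
            current = []
        current.append(cur)
    segments.append(current)
    # A line needs at least two points, so drop isolated points
    return [s for s in segments if len(s) >= 2]

coords = [(0, 0), (100, 0), (200, 0), (5000, 0), (5100, 0), (9000, 0)]
segments = split_at_gaps(coords)
```

Here the two long jumps are cut out, and the isolated last point is dropped because it cannot form a line by itself.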