How can I efficiently convert a pandas DataFrame with x, y, z coordinates into a 4D NumPy array?

Example: suppose we have x, y, z coordinates (the values can differ from these) and a specific value a for each coordinate, as shown below:
             x       y    z   a
0       219115  166637  923 NaN
1       219116  166637  923 NaN
2       219117  166637  923 NaN
3       219118  166637  923 NaN
4       219119  166637  923 NaN
...        ...     ...  ...  ..
124995  219160  166686  972 NaN
124996  219161  166686  972 NaN
124997  219162  166686  972 NaN
124998  219163  166686  972 NaN
124999  219164  166686  972 NaN
I want to convert it into a 4-dimensional NumPy array like the one shown below. I am going to use this 4D array to save the data in a TIFF file, which requires a 4-dimensional NumPy array, and that is where I am struggling.
array([[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
...,
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]])
Thanks very much, hoping for the best.
I tried pd.pivot_table followed by .to_numpy(), but I could only get a single 2D matrix, not the matrix of matrices (4D) that I need.
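One possible approach, sketched here under a few assumptions: the coordinates sit on a regular grid, each (x, y, z) triple appears at most once, and the fourth dimension is simply a length-1 leading axis (the question does not say what the fourth axis should represent). The helper name frame_to_grid and the (z, y, x) axis order are illustrative choices, not something from the original post.

```python
import numpy as np
import pandas as pd


def frame_to_grid(df, value="a", dims=("z", "y", "x")):
    """Scatter long-format rows into a dense N-D array (NaN where no row exists)."""
    codes, shape = [], []
    for d in dims:
        # Sorted unique coordinates along this axis, plus each row's index into them.
        uniq, inv = np.unique(df[d].to_numpy(), return_inverse=True)
        codes.append(inv)
        shape.append(len(uniq))
    grid = np.full(shape, np.nan, dtype=float)
    # One vectorised assignment: row i lands at (codes[0][i], codes[1][i], codes[2][i]).
    grid[tuple(codes)] = df[value].to_numpy(dtype=float)
    return grid


# Small stand-in for the table in the question: a 2 x 3 x 4 block of coordinates.
xs = np.arange(219115, 219119)
ys = np.arange(166637, 166640)
zs = np.arange(923, 925)
zz, yy, xx = np.meshgrid(zs, ys, xs, indexing="ij")
df = pd.DataFrame({"x": xx.ravel(), "y": yy.ravel(), "z": zz.ravel()})
df["a"] = np.arange(len(df), dtype=float)
df = df.sample(frac=1, random_state=0)  # shuffle: row order should not matter

vol = frame_to_grid(df)        # 3D array indexed as (z, y, x)
vol4d = vol[np.newaxis, ...]   # assumed: extra leading axis of length 1
print(vol.shape, vol4d.shape)
```

Because the rows are placed by index rather than pivoted, missing coordinates simply stay NaN, and the cost is roughly one sort per coordinate column. If the fourth axis is meant to carry something else (time, channels), the same helper could be called with four entries in dims instead.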

Related

Looping through a matrix and plotting in R

I have two matrices in R, lag_mat and r_mat, and both have dimensions 16x16x3x2x2.
I use the following code to plot these in R.
library(R.matlab)
library("wesanderson")
library("ggplot2")
library("ggsci")
library(corrplot)
library(plotly)
library(viridis)
#CCO left and right stimulation time window 2
lag_mat = matrix(CCO_lag[, , 1,2], 16)
r_mat = matrix(CCO[, , 1,2], 16)
row = c(row(lag_mat))
col = c(col(lag_mat))
dd = data.frame( lag = c(lag_mat), r = c(r_mat), row, col )
p1 <- ggplot(dd, aes(x = row, y = col, size = lag, color = r)) +
geom_point( alpha = 1.5, stroke = 2.5) +
ggtitle("CCO, RIGHT Stimulation") +
theme(plot.title = element_text(size=10, face="bold"),
legend.position = "none",
axis.title.x=element_blank(),
axis.title.y=element_blank(),
panel.grid.major = element_line(size = 0.5, linetype = 'solid',
colour = "white"),
panel.grid.minor = element_line(size = 0.5, linetype = 'solid',
colour = "white"),axis.text.x = element_text(size=8)) +
# scale_color_viridis( begin = 0.2 , end = 1, direction = 1 )+
scale_color_gradient2(low = "#4169E1" , mid = "#ffffbf" , high = "#FF8C00", limits=c(-1 ,1)) +
# scale_y_reverse() +
# scale_size_area(trans = "reverse")+
scale_size_continuous(range = c(5,0),limits=c(-12,0))+
scale_x_discrete(limits=c("CP1","P7","P3","Pz","PO3","T1", "M1","Oz","M2","T2","PO4","P4","P8", "CP2","Cz","Fz")) +
scale_y_continuous(limits = c(1,16),breaks=seq(1,16,1))
The issue is that I need to loop through the last dimension. I have run some more analyses, and the last dimension of the matrices is now 21 instead of 2. Until now I simply had two scripts, one per slice of the last dimension (not very efficient, I know). One script used
r_mat = matrix(CCO[, , 1,1], 16)
and the other for
r_mat = matrix(CCO[, , 1,2], 16)
Now, of course, I can't maintain 21 scripts, but I'm unsure how to loop and plot in R.
Can anyone help me with this, so that I can loop through the last dimension and produce 21 figures with ggplot?
Thanks!
Here is the data. I have reproduced a smaller version such that both matrices are now dimension 16x16x1x2.
CCO<-structure(c(-0.492578655481339, NaN, NaN, NaN, -0.492525190114975,
-0.492525696754456, NaN, -0.492627799510956, -0.492677986621857,
-0.492468953132629, NaN, NaN, NaN, -0.49228835105896, -0.492546766996384,
-0.492437690496445, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.521651923656464, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0.473261743783951, NaN, 0.472789525985718, -0.600778460502625,
NaN, NaN, -0.600829541683197, -0.6008580327034, -0.601057589054108,
NaN, -0.600822031497955, -0.600911736488342, -0.600730240345001,
NaN, NaN, NaN, -0.600953936576843, -0.600802004337311, -0.600861430168152,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.521026790142059, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.577225089073181, NaN, NaN, -0.577208399772644,
-0.577145278453827, -0.577321112155914, NaN, -0.577184557914734,
-0.577165722846985, -0.577133357524872, NaN, NaN, NaN, -0.577190637588501,
-0.577230930328369, -0.577144026756287, -0.41020467877388, NaN,
NaN, NaN, -0.410186648368835, -0.410334318876266, NaN, -0.410211980342865,
-0.410197377204895, -0.410110324621201, NaN, NaN, NaN, -0.410272806882858,
NaN, NaN, -0.733388960361481, NaN, NaN, NaN, NaN, -0.733434438705444,
NaN, -0.733347833156586, -0.733303666114807, -0.733347356319427,
NaN, NaN, NaN, NaN, -0.733397245407104, -0.73332667350769, -0.702324509620667,
NaN, NaN, NaN, NaN, NaN, NaN, -0.702237844467163, -0.702238082885742,
-0.702193081378937, NaN, NaN, NaN, -0.702261865139008, -0.702301025390625,
NaN, -0.80294394493103, NaN, NaN, -0.802956938743591, -0.802938997745514,
-0.803096830844879, NaN, -0.802961885929108, -0.802923500537872,
-0.802861630916595, NaN, NaN, NaN, -0.803063333034515, -0.802979350090027,
-0.802873134613037, -0.684592604637146, NaN, NaN, -0.684580564498901,
-0.684580743312836, -0.684802889823914, NaN, -0.684630811214447,
-0.684578239917755, -0.684465110301971, NaN, NaN, NaN, -0.684730887413025,
-0.684608578681946, -0.684436023235321, -0.606923937797546, NaN,
NaN, NaN, -0.606987476348877, NaN, NaN, -0.606982827186584, NaN,
NaN, NaN, NaN, NaN, -0.606993675231934, NaN, NaN, -0.746234655380249,
NaN, NaN, -0.7463099360466, -0.746258854866028, -0.746564209461212,
NaN, -0.746362566947937, -0.746387183666229, -0.746385276317596,
NaN, NaN, NaN, -0.746756434440613, -0.746286571025848, -0.746472299098969,
NaN, NaN, NaN, NaN, -0.526792407035828, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -0.526629209518433, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, -0.402197241783142, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, -0.515719473361969, NaN, NaN, NaN, -0.515782594680786,
-0.516006171703339, NaN, -0.515946447849274, -0.515853404998779,
-0.515883803367615, NaN, NaN, NaN, -0.515994668006897, -0.515867114067078,
-0.515911042690277, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.4820496737957, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0.535082995891571, NaN, 0.534462213516235, -0.567049205303192,
NaN, NaN, -0.567097425460815, -0.567124307155609, -0.567312657833099,
NaN, -0.567090332508087, -0.567174971103668, -0.567003667354584,
NaN, NaN, NaN, -0.567214787006378, -0.567071437835693, -0.567127525806427,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.437827885150909, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.496517241001129, NaN, NaN, -0.496502816677094,
-0.496448516845703, -0.496599793434143, NaN, -0.496482282876968,
-0.496466100215912, -0.496438264846802, NaN, NaN, NaN, -0.496487557888031,
-0.496522217988968, -0.496447324752808, 0.43168780207634, NaN,
NaN, NaN, 0.43162015080452, 0.431624948978424, NaN, 0.43173423409462,
0.431787043809891, 0.431506514549255, NaN, NaN, NaN, 0.431388199329376,
NaN, NaN, -0.673626005649567, NaN, NaN, NaN, NaN, -0.673667669296265,
NaN, -0.67358809709549, -0.673547565937042, -0.673587679862976,
NaN, NaN, NaN, NaN, -0.673633456230164, -0.673568665981293, -0.657320320606232,
NaN, NaN, NaN, NaN, NaN, NaN, -0.65728884935379, -0.657253861427307,
-0.657285273075104, NaN, NaN, NaN, -0.657291948795319, -0.657335460186005,
NaN, -0.793729186058044, NaN, NaN, -0.793741881847382, -0.793724238872528,
-0.793880224227905, NaN, -0.793746829032898, -0.793708860874176,
-0.793647706508636, NaN, NaN, NaN, -0.793846964836121, -0.793764173984528,
-0.793659150600433, -0.639408528804779, NaN, NaN, -0.639397382736206,
-0.63939756155014, -0.639605164527893, NaN, -0.639444351196289,
-0.63939505815506, -0.639289438724518, NaN, NaN, NaN, -0.639537692070007,
-0.639423429965973, -0.63926237821579, -0.567462205886841, NaN,
NaN, NaN, -0.567524492740631, NaN, NaN, -0.567518472671509, NaN,
NaN, NaN, NaN, NaN, -0.567527711391449, NaN, NaN, -0.76900988817215,
NaN, NaN, -0.769101619720459, -0.769054174423218, -0.769321501255035,
NaN, -0.769179046154022, -0.769175291061401, -0.769182145595551,
NaN, NaN, NaN, -0.769531965255737, -0.769078016281128, -0.769262313842773,
NaN, NaN, NaN, NaN, -0.0669489949941635, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.0665916055440903, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, 0.425303876399994, NaN, NaN, NaN,
NaN, NaN, NaN, NaN), .Dim = c(16L, 16L, 2L))
CCO_lag<-structure(c(0, NaN, NaN, NaN, 0, 0, NaN, 0, 1, 0, NaN, NaN, NaN,
1, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -3, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 5, NaN, 5, -3, NaN, NaN,
-3, -3, -3, NaN, -3, -3, -3, NaN, NaN, NaN, -3, -3, -3, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -1, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, -3, NaN, NaN, -3, -3, -3, NaN, -3, -3, -3, NaN, NaN,
NaN, -3, -3, -3, -4, NaN, NaN, NaN, -4, -4, NaN, -4, -4, -4,
NaN, NaN, NaN, -4, NaN, NaN, 0, NaN, NaN, NaN, NaN, 0, NaN, 0,
0, 0, NaN, NaN, NaN, NaN, 0, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN,
0, 0, 0, NaN, NaN, NaN, 0, 0, NaN, 0, NaN, NaN, 1, 0, 1, NaN,
0, 1, 1, NaN, NaN, NaN, 1, 0, 1, -2, NaN, NaN, -2, -2, -2, NaN,
-2, -2, -1, NaN, NaN, NaN, -2, -2, -2, 0, NaN, NaN, NaN, 0.5,
NaN, NaN, 0.5, NaN, NaN, NaN, NaN, NaN, 0.5, NaN, NaN, 1, NaN,
NaN, 1, 1, 1, NaN, 1, 1, 1, NaN, NaN, NaN, 1, 1, 1, NaN, NaN,
NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 0.5, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, 0, 0, NaN, 0, 0, 0, NaN,
NaN, NaN, 0, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -4, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 5, NaN, 5,
-2, NaN, NaN, -2, -2, -2, NaN, -2, -2, -2, NaN, NaN, NaN, -2,
-2, -2, NaN, NaN, NaN, NaN, NaN, NaN, NaN, -2, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -2, NaN, NaN, -2, -2, -2, NaN, -2, -2,
-2, NaN, NaN, NaN, -2, -2, -2, -2, NaN, NaN, NaN, -2, -2, NaN,
-2, -2, -2, NaN, NaN, NaN, -2, NaN, NaN, -1, NaN, NaN, NaN, NaN,
-1, NaN, -1, -1, -1, NaN, NaN, NaN, NaN, -1, -1, -1, NaN, NaN,
NaN, NaN, NaN, NaN, -1, -1, -1, NaN, NaN, NaN, -1, -1, NaN, 0,
NaN, NaN, 0, 0, 0, NaN, 0, 0, 0, NaN, NaN, NaN, 0, 0, 0, -3,
NaN, NaN, -3, -3, -3, NaN, -3, -3, -2, NaN, NaN, NaN, -3, -3,
-3, 0, NaN, NaN, NaN, 0, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, 0, NaN, NaN, 0, 0, 0, NaN, 0, 0, 0, NaN, NaN, NaN,
0, 0, 0, NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, NaN, NaN, NaN, NaN, NaN), .Dim = c(16L, 16L, 2L))
You could loop along the desired dimension of your array by using lapply(seq_len(dim(my_array)[n]), ...), wherein n is your dimension of interest.
If you then use function(i) {...} inside the lapply() and put the i at the correct spot in the subsetting operation, it should pick out the appropriate data.
If the last line of the function outputs a ggplot object, it automatically gets saved in a list. Simplified example below:
library(ggplot2)
CCO<- array(rnorm(prod(16, 2, 1, 21)), c(16, 2, 1, 21))
CCO_lag <- array(rnorm(prod(16, 2, 1, 21)), c(16, 2, 1, 21))
plots <- lapply(seq_len(dim(CCO)[4]), function(i) {
  lag_mat = matrix(CCO_lag[, , 1,i], 16)
  r_mat = matrix(CCO[, , 1,i], 16)
  row = c(row(lag_mat))
  col = c(col(lag_mat))
  dd = data.frame( lag = c(lag_mat), r = c(r_mat), row, col )
  ggplot(dd, aes(x = row, y = col)) +
    geom_point(alpha = 1.5, stroke = 2.5)
})
# Just to show plots come out
patchwork::wrap_plots(plots)
Created on 2021-01-07 by the reprex package (v0.3.0)

Reorder row names of matrices in a list and replace NaN and zeros with ones

I have a list of matrices and I need to order their row names. I have 7 letter rating categories but the row names are a combination of two ratings separated by a hyphen. I would like the row names to be sorted according to the rating before the hyphen. After solving this problem, I'd like to convert the NaN values and 0 values to ones since I have to take the log of every element in the matrices. However, when I replace the NaN with 1 and then proceed to replace the 0 values with 1 the NaN values reappear again.
The object row.order contains the order I would like to follow.
row.order <- c("Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa")
The dput of the list of matrices:
dput(phij.list)
list(structure(c(0.375, 0.268292682926829, 0.384615384615385,
NaN, NaN, 0.222222222222222, NaN, 0.4375, 0.51219512195122, 0.282051282051282,
NaN, NaN, 0.444444444444444, NaN, 0.0625, 0.195121951219512,
0.230769230769231, NaN, NaN, 0.333333333333333, NaN, 0.125, 0.024390243902439,
0.0769230769230769, NaN, NaN, 0, NaN, 0, 0, 0.0256410256410256,
NaN, NaN, 0, NaN, 0, 0, 0, NaN, NaN, 0, NaN, 0, 0, 0, NaN, NaN,
0, NaN), .Dim = c(7L, 7L), .Dimnames = list(hi = c("A-Aaa", "Aa-Aaa",
"Aaa-Aaa", "B-Aaa", "Ba-Aaa", "Baa-Aaa", "Caa-Aaa"), j = c("Aaa",
"Aa", "A", "Baa", "Ba", "B", "Caa"))), structure(c(0.0425531914893617,
0.0641509433962264, 0.27906976744186, 0.0714285714285714, 0,
0.0625, 0, 0.425531914893617, 0.532075471698113, 0.418604651162791,
0.428571428571429, 0.551724137931034, 0.453125, 0, 0.304964539007092,
0.211320754716981, 0.162790697674419, 0.214285714285714, 0.275862068965517,
0.25, 0, 0.113475177304965, 0.132075471698113, 0.116279069767442,
0.142857142857143, 0.137931034482759, 0.140625, 0, 0.0921985815602837,
0.0452830188679245, 0.0232558139534884, 0.142857142857143, 0,
0.0625, 1, 0.0212765957446809, 0.0150943396226415, 0, 0, 0.0344827586206897,
0.03125, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(7L, 7L), .Dimnames = list(
hi = c("A-Aa", "Aa-Aa", "Aaa-Aa", "B-Aa", "Ba-Aa", "Baa-Aa",
"Caa-Aa"), j = c("Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa"
))), structure(c(0.00769230769230769, 0.0775193798449612,
0.0869565217391304, 0, 0, 0.00671140939597315, 0, 0.188461538461538,
0.317829457364341, 0.173913043478261, 0.296296296296296, 0.037037037037037,
0.23489932885906, 0.5, 0.496153846153846, 0.341085271317829,
0.478260869565217, 0.333333333333333, 0.462962962962963, 0.342281879194631,
0, 0.207692307692308, 0.193798449612403, 0.260869565217391, 0.222222222222222,
0.333333333333333, 0.281879194630872, 0.5, 0.0884615384615385,
0.062015503875969, 0, 0.111111111111111, 0.111111111111111, 0.087248322147651,
0, 0.00384615384615385, 0.00775193798449612, 0, 0.037037037037037,
0.0555555555555556, 0.0402684563758389, 0, 0.00769230769230769,
0, 0, 0, 0, 0.00671140939597315, 0), .Dim = c(7L, 7L), .Dimnames = list(
hi = c("A-A", "Aa-A", "Aaa-A", "B-A", "Ba-A", "Baa-A", "Caa-A"
), j = c("Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa"))), structure(c(0.0196078431372549,
0.0434782608695652, 0.166666666666667, 0, 0, 0.0116959064327485,
0, 0.163398692810458, 0.159420289855072, 0.666666666666667, 0.0571428571428571,
0.0648148148148148, 0.0994152046783626, 0, 0.300653594771242,
0.347826086956522, 0.166666666666667, 0.285714285714286, 0.222222222222222,
0.251461988304094, 0.333333333333333, 0.274509803921569, 0.260869565217391,
0, 0.314285714285714, 0.37037037037037, 0.350877192982456, 0.333333333333333,
0.163398692810458, 0.130434782608696, 0, 0.228571428571429, 0.194444444444444,
0.233918128654971, 0.333333333333333, 0.065359477124183, 0.0579710144927536,
0, 0.114285714285714, 0.12037037037037, 0.0526315789473684, 0,
0.0130718954248366, 0, 0, 0, 0.0277777777777778, 0, 0), .Dim = c(7L,
7L), .Dimnames = list(hi = c("A-Baa", "Aa-Baa", "Aaa-Baa", "B-Baa",
"Ba-Baa", "Baa-Baa", "Caa-Baa"), j = c("Aaa", "Aa", "A", "Baa",
"Ba", "B", "Caa"))), structure(c(0, 0, 0, 0, 0, 0, 0, 0.150943396226415,
0.212121212121212, 1, 0.02, 0.0285714285714286, 0.0925925925925926,
0, 0.264150943396226, 0.272727272727273, 0, 0.06, 0.104761904761905,
0.138888888888889, 0.214285714285714, 0.415094339622642, 0.212121212121212,
0, 0.12, 0.238095238095238, 0.333333333333333, 0.0714285714285714,
0.0754716981132075, 0.272727272727273, 0, 0.4, 0.333333333333333,
0.305555555555556, 0.214285714285714, 0.0754716981132075, 0,
0, 0.36, 0.247619047619048, 0.101851851851852, 0.357142857142857,
0.0188679245283019, 0.0303030303030303, 0, 0.04, 0.0476190476190476,
0.0277777777777778, 0.142857142857143), .Dim = c(7L, 7L), .Dimnames = list(
hi = c("A-Ba", "Aa-Ba", "Aaa-Ba", "B-Ba", "Ba-Ba", "Baa-Ba",
"Caa-Ba"), j = c("Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa"
))), structure(c(0, 0, NaN, 0, 0, 0, 0, 0, 0.2, NaN, 0.0508474576271186,
0.0476190476190476, 0.128205128205128, 0.0476190476190476, 0.25,
0.2, NaN, 0.101694915254237, 0.142857142857143, 0.179487179487179,
0, 0.333333333333333, 0.4, NaN, 0.0677966101694915, 0.174603174603175,
0.230769230769231, 0.0952380952380952, 0.25, 0.2, NaN, 0.271186440677966,
0.238095238095238, 0.256410256410256, 0.19047619047619, 0.166666666666667,
0, NaN, 0.355932203389831, 0.285714285714286, 0.153846153846154,
0.523809523809524, 0, 0, NaN, 0.152542372881356, 0.111111111111111,
0.0512820512820513, 0.142857142857143), .Dim = c(7L, 7L), .Dimnames = list(
hi = c("A-B", "Aa-B", "Aaa-B", "B-B", "Ba-B", "Baa-B", "Caa-B"
), j = c("Aaa", "Aa", "A", "Baa", "Ba", "B", "Caa"))), structure(c(0,
NaN, NaN, 0, 0, 0, 0, 0, NaN, NaN, 0, 0, 0.142857142857143, 0,
0, NaN, NaN, 0, 0.142857142857143, 0, 0, 0.333333333333333, NaN,
NaN, 0.0526315789473684, 0.214285714285714, 0, 0.0666666666666667,
0.666666666666667, NaN, NaN, 0.263157894736842, 0.142857142857143,
0.428571428571429, 0.0666666666666667, 0, NaN, NaN, 0.473684210526316,
0.214285714285714, 0.285714285714286, 0.466666666666667, 0, NaN,
NaN, 0.210526315789474, 0.285714285714286, 0.142857142857143,
0.4), .Dim = c(7L, 7L), .Dimnames = list(hi = c("A-Caa", "Aa-Caa",
"Aaa-Caa", "B-Caa", "Ba-Caa", "Baa-Caa", "Caa-Caa"), j = c("Aaa",
"Aa", "A", "Baa", "Ba", "B", "Caa"))))
The code I'm using to change NaN to 1:
lapply(phij.list, function(x) replace(x, !is.finite(x), 1))
The code I'm using to change the 0 values to 1
lapply(phij.list, function(x) replace(x, x==0, 1))
You can use sub to remove text after "-" in rownames and then match them with row.order to get the correct order. We can then replace NaN and 0 values with 1.
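new_list <- lapply(phij.list, function(x) {
  # match() gives each row's rank in row.order; order() turns those ranks into a row permutation
  temp <- x[order(match(sub('-.*', '', rownames(x)), row.order)), ]
  replace(temp, is.nan(temp) | temp == 0, 1)
})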
#[[1]]
# j
#hi Aaa Aa A Baa Ba B Caa
# Aaa-Aaa 0.385 0.282 0.2308 0.0769 0.0256 1 1
# Aa-Aaa 0.268 0.512 0.1951 0.0244 1.0000 1 1
# A-Aaa 0.375 0.438 0.0625 0.1250 1.0000 1 1
# Baa-Aaa 0.222 0.444 0.3333 1.0000 1.0000 1 1
# Ba-Aaa 1.000 1.000 1.0000 1.0000 1.0000 1 1
# B-Aaa 1.000 1.000 1.0000 1.0000 1.0000 1 1
# Caa-Aaa 1.000 1.000 1.0000 1.0000 1.0000 1 1
#[[2]]
# j
#hi Aaa Aa A Baa Ba B Caa
# Aaa-Aa 0.2791 0.419 0.163 0.116 0.0233 1.0000 1
# Aa-Aa 0.0642 0.532 0.211 0.132 0.0453 0.0151 1
# A-Aa 0.0426 0.426 0.305 0.113 0.0922 0.0213 1
# Baa-Aa 0.0625 0.453 0.250 0.141 0.0625 0.0312 1
# Ba-Aa 1.0000 0.552 0.276 0.138 1.0000 0.0345 1
# B-Aa 0.0714 0.429 0.214 0.143 0.1429 1.0000 1
# Caa-Aa 1.0000 1.000 1.000 1.000 1.0000 1.0000 1
#...
#...

xarray DataArray.where() reduced coordinate when masking

xarray novice here. Very simple case, I have a precipitation type array (ntim x nlat x nlon) and a total precipitation array (same dimensions). Both are in separate netCDF files. I want to mask the precipitation array where A) precipitation is falling (> 1e-8 m/s rate) and B) the precipitation type is snow (maskvar = 0.0). The output array is therefore a "where is it snowing?" array.
When using xarray where() with multiple conditions from two different (but same-sized) arrays, only two latitudes persist (north and south pole) in the resulting masked array.
However, if I use a pre-masked array (from NCL, written as netCDF w/ same dims) as a test, it behaves as expected (i.e., returns ntim x nlat x nlon) array.
The only obvious thing that sticks out to me is that the lat coordinate is not identically typed in the two arrays, although it's unclear to me why that would cause a failure of this kind.
Any help appreciated.
Sample code:
import xarray as xr

ensnum='001'
indir = '/glade/u/home/zarzycki/scratch/LENS-snow/'
files = [indir+'/b.e11.B20TRC5CNBDRD.f09_g16.'+ensnum+'.cam.h2.PTYPE.1990010100Z-2005123118Z.nc']
indir2 = '/glade/p_old/cesmLE/CESM-CAM5-BGC-LE/atm/proc/tseries/hourly6/PRECT/'
files2 = [indir2+'/b.e11.B20TRC5CNBDRD.f09_g16.'+ensnum+'.cam.h2.PRECT.1990010100Z-2005123118Z.nc']
indir3 = indir
files3 = [indir3+'/b.e11.B20TRC5CNBDRD.f09_g16.'+ensnum+'.cam.h2.PRECT_SNOW.1990010100Z-2005123118Z.nc']
for idx, val in enumerate(files):
    ds = xr.open_dataset(files[idx])
    ds2 = xr.open_dataset(files2[idx])
    ds3 = xr.open_dataset(files3[idx])
    ptype = ds.PTYPE[1:11,:,:] # 10 time x 192 lat x 288 lon
    prect1 = ds2.PRECT[1:11,:,:] # 10 time x 192 lat x 288 lon
    prect2 = ds3.PRECT_SNOW[1:11,:,:] # 10 time x 192 lat x 288 lon
    print('---------')
    print(ptype)
    print(prect1)
    print(prect2)
    ptype1 = ptype.where((ptype > -0.1) & (ptype < 0.1) & (prect1 > 1e-8))
    ptype2 = ptype.where((ptype > -0.1) & (ptype < 0.1) & (prect2 > 1e-8))
    print('---------')
    print(ptype1)
    print(ptype2)
Sample output showing that all read vars are (time: 10, lat: 192, lon: 288) but returned masked vars are (time: 10, lat: 2, lon: 288) and (time: 10, lat: 192, lon: 288)
---------
<xarray.DataArray 'PTYPE' (time: 10, lat: 192, lon: 288)>
[552960 values with dtype=float32]
Coordinates:
* lat (lat) float32 -90.0 -89.0576 -88.1152 -87.1728 -86.2304 -85.288 ...
* lon (lon) float32 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 11.25 ...
* time (time) datetime64[ns] 1990-01-01T12:00:00 1990-01-01T18:00:00 ...
<xarray.DataArray 'PRECT' (time: 10, lat: 192, lon: 288)>
[552960 values with dtype=float32]
Coordinates:
* lat (lat) float64 -90.0 -89.06 -88.12 -87.17 -86.23 -85.29 -84.35 ...
* lon (lon) float64 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 11.25 ...
* time (time) datetime64[ns] 1990-01-01T12:00:00 1990-01-01T18:00:00 ...
Attributes:
units: m/s
long_name: Total (convective and large-scale) precipitation rate (liq...
cell_methods: time: mean
<xarray.DataArray 'PRECT_SNOW' (time: 10, lat: 192, lon: 288)>
[552960 values with dtype=float32]
Coordinates:
* lat (lat) float32 -90.0 -89.0576 -88.1152 -87.1728 -86.2304 -85.288 ...
* lon (lon) float32 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 11.25 ...
* time (time) datetime64[ns] 1990-01-01T12:00:00 1990-01-01T18:00:00 ...
Attributes:
units: m/s
---------
<xarray.DataArray (time: 10, lat: 2, lon: 288)>
array([[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
...,
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]]], dtype=float32)
Coordinates:
* lat (lat) float64 -90.0 90.0
* lon (lon) float32 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 11.25 ...
* time (time) datetime64[ns] 1990-01-01T12:00:00 1990-01-01T18:00:00 ...
<xarray.DataArray (time: 10, lat: 192, lon: 288)>
array([[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan],
...,
[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan],
...,
[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
...,
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan],
...,
[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]],
[[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan],
...,
[ nan, nan, ..., nan, nan],
[ nan, nan, ..., nan, nan]]], dtype=float32)
Coordinates:
* lat (lat) float32 -90.0 -89.0576 -88.1152 -87.1728 -86.2304 -85.288 ...
* lon (lon) float32 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 11.25 ...
* time (time) datetime64[ns] 1990-01-01T12:00:00 1990-01-01T18:00:00 ...
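A plausible explanation, consistent with the printed output: xarray aligns arrays on their coordinate labels before combining them, and by default arithmetic keeps only labels that match exactly in both. PTYPE stores lat as float32 while PRECT stores it as float64, so after upcasting only latitudes that are exactly representable in both precisions (such as -90.0 and 90.0) are still equal. The NumPy-only sketch below illustrates that effect with a stand-in 192-point latitude axis. It is an assumption that the real files' coordinates behave the same way, and the variable names are illustrative.

```python
import numpy as np

# Stand-in for a 192-point latitude axis stored at two precisions.
lat64 = np.linspace(-90.0, 90.0, 192)   # float64 labels (like PRECT)
lat32 = lat64.astype(np.float32)        # float32 labels (like PTYPE)

# Exact label matching after upcasting: rounded values no longer compare equal.
common = np.intersect1d(lat64, lat32.astype(np.float64))

# A possible workaround: give both arrays identical labels before combining them.
lat32_fixed = lat64.copy()
common_fixed = np.intersect1d(lat64, lat32_fixed)

print(len(common), len(common_fixed))
print(common)
```

If this is indeed the cause, overwriting one array's lat coordinate with the other's before calling where() (for example with assign_coords) should restore the full 192-latitude result. PRECT_SNOW works in the question because its lat is float32, the same as PTYPE.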

R - convert nan to 0 results in all 0's

I have a data frame containing NaN's that I'd like to convert to 0's. I wrote a function that I think should work:
fix_nan <- function(x){
return(x[is.nan(x)] <- 0)
}
And then I apply it to the data frame:
train_e <- structure(list(pack_id = structure(1:10, .Label = c("1", "2",
"4", "5", "7", "8", "9", "10", "11", "14"), class = "factor"),
item_1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), item_2 = c(NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN), item_3 = c(1.45225232891169,
0.613104472886409, NaN, 1.02450431651439, 0.735706794978741,
0.741937344729377, NaN, 0.83034830207343, 0.97650959186721,
0.750305594399894), item_4 = c(0.645137961373585, 0.615792803650477,
Inf, 0.752866415261568, 0.84901755126673, 0.646398200985872,
Inf, 0.786548355648346, 0.725113372622438, 0.709897990984761
), item_5 = c(NaN, NaN, NaN, 0, 0, 0, NaN, NaN, 0, 0), item_6 = c(0.510825623765991,
0.510825623765991, NaN, 0.510825623765991, 0.510825623765991,
0.510825623765991, NaN, 0.510825623765991, 0.847297860387204,
0.510825623765991)), .Names = c("pack_id", "item_1", "item_2",
"item_3", "item_4", "item_5", "item_6"), row.names = c(26155L,
6236L, 6281L, 6014L, 6035L, 26217L, 5576L, 6316L, 5594L, 26244L
), class = "data.frame")
vtf1 <- c('item_1','item_2','item_3','item_4','item_5','item_6')
train_e[,vtf1] <- as.data.frame(lapply(train_e[,vtf1], fix_nan))
head(train_e)
And I get all 0's:
> head(train_e)
pack_id item_1 item_2 item_3 item_4 item_5 item_6
26155 1 0 0 0 0 0 0
6236 2 0 0 0 0 0 0
6281 4 0 0 0 0 0 0
6014 5 0 0 0 0 0 0
6035 7 0 0 0 0 0 0
26217 8 0 0 0 0 0 0
Any suggestions?
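In R, the value of an assignment expression is the assigned (right-hand) value, so return(x[is.nan(x)] <- 0) returns 0 rather than the modified vector, and every column ends up as all zeros. To fix this, do the assignment first and then return x:
fix_nan <- function(x){
  x[is.nan(x)] <- 0
  x
}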

R cumulative sum calculation NA issue

I have been trying to solve a cumulative sum issue for a couple of days and have gotten extremely close, but am still encountering a few problems.
I'm trying to calculate a cumulative sum backwards (from nrow up to the first row) for multiple columns in a data.frame. The code works perfectly when there are no NA/NaN values at the end of the data.frame. But if an NA value is present, the code returns an actual value, when instead I'd like it to return NA. Also, I need the ending value (RBH row in df2) to be present for the last year in which I have a measurement.
Sample measurements for df2:
2009 - 1.2
2010 - 1.8
2011 - NaN
2012 - NaN
RBH - 60.5
Intended Output (would be in df3):
2008 - 57.5
2009 - 58.7
2010 - 60.5
2011 - NaN
2012 - NaN
What my current code gives me for df3:
2008 - 57.5
2009 - 58.7
2010 - 60.5
2011 - 59.5
2012 - 60.5
Code I'm trying:
#Build the function to deal with NA values (ex: died in 2010, NA for 2011 & 2012):
cumsum.alt <- function(x){
  res <- NaN*seq(x)
  for(i in seq(x)){
    if(sum(is.na(x[1])) == i){
      res[i] <- i
    } else {
      res[i] <- sum(x[1:i], na.rm=TRUE)
    }
  }
  res
}
#Run function to produce annual radius:
##STILL NEED TO FIX NA ISSUE
df3 <- apply(df2[nrow(df2):1,], 2, function(x) c(x[1], x[1]-cumsum.alt(x[-1])))
df3 <- df3[nrow(df3):1,]
Reproducible Data.frame:
df2 <- structure(list(AP2D005 = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, 1.896, 4.221, 1.204, 1.934, 1.859, 1.575, 1.602, 1.705,
1.413, 0.786, 1.352, 0.903, 0.821, 1.0855, 1.3375, 1.554, 1.605,
1.192, 1.395, 1.6965, 1.016, 1.0835, 1.464, 2.0505, 1.719, 2.067,
2.0025, 1.9245, 2.4895, 2.3465, 2.0105, 0.897, 1.004, 1.6785,
2.4405, 3.0625, 2.173, 2.629, 3.014, 2.7245, 3.2625, 3.115, 1.515,
2.632, 2.067, 2.8155, 2.914, 2.3865, 1.976, 2.3085, 3.1135, 3.476,
3.671, 2.1465, 3.0125, 2.129, 1.8335, 0.689, 0.8775, 1, 1.616,
1.618, 2.5385, 1.9465, 1.799, 1.194, 0.7295, 0.7425, 0.5895,
131.85), AP2D006 = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, 3.64, 2.972, 1.402, 1.421,
1.622, 1.648, 2.379, 2.014, 2.182, 0.802, 1.812, 1.139, 1.042,
1.5435, 2.097, 2.064, 1.205, 1.955, 1.2985, 1.6255, 1.697, 2.3645,
2.6805, 2.2965, 2.3095, 2.082, 2.4395, 1.863, 1.879, 2.2505,
2.648, 2.5805, 2.6895, 2.587, 3.393, 3.1505, 3.543, 2.765, 0.7355,
0.508, 0.5035, 0.681, 1.0305, 1.308, 1.966, 2.32, 1.814, 2.847,
2.5295, 1.262, 2.058, 1.5235, 2.1625, 2.1215, 1.3525, 1.368,
1.574, 2.1725, 2.8545, 2.219, 1.717, 2.0185, 1.128, 1.1475, 0.591,
0.4725, 0.44, 0.485, 0.5375, 0.5215, 0.5845, 0.565, 0.5065, 0.367,
0.353, 0.2545, 121.5), AP2D009 = c(NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 1.485, 1.695, 1.655,
1.1835, 1.324, 1.0755, 0.7495, 1.014, 1.2435, 1.841, 1.8845,
1.148, 1.066, 1.926, 2.5395, 1.5005, 1.59, 1.3565, 1.5405, 1.7205,
1.5825, 1.245, 1.883, 1.907, 2.149, 1.512, 0.8935, 0.6925, 0.687,
1.265, 1.5055, 0.4295, 0.3495, 0.4275, 0.4615, 0.5665, 0.4045,
0.309, 0.187, 0.2205, 0.2705, 0.6155, 0.9485, 0.977, 0.7205,
1.3575, 1.4925, 1.43, 1.1535, 1.3195, 1.184, 1.1885, 0.5415,
0.7375, 0.7455, 1.08, 1.2335, 1.269, 1.1135, 1.193, 0.535, 0.4935,
0.349, 0.2665, 71.1), AP2D101 = c(NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 2.549, 1.393, 1.54,
1.821, 1.65, 1.357, 1.742, 1.629, 2.11, 2.11, 1.972, 1.58, 1.88,
1.9745, 1.3035, 1.0575, 1.5935, 1.6695, 1.4555, 2.306, 2.4825,
2.1905, 3.2565, 3.599, 3.058, 1.5925, 0.8025, 0.4385, 0.514,
0.6395, 0.581, 0.476, 0.5115, 0.864, 1.348, 0.6565, 0.3845, 0.35,
0.2895, 0.4045, 0.471, 0.2795, 0.365, 0.256, 0.2685, 0.444, 0.329,
0.1945, 0.1995, 0.307, 0.28, 0.1935, 0.1925, 0.176, 0.156, 0.1955,
0.1915, 0.2485, 0.236, 0.192, 0.1785, 0.1745, NaN, 77.85), AP2D102 = c(NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, 1.083, 0.596, 0.8295, 0.341, 0.302, 0.1795, 0.3505, 0.2935,
0.792, 0.796, 0.794, 0.5485, 0.6185, 1.1145, 0.6725, 0.542, 0.5935,
0.92, 1.058, 1.3855, 1.089, 1.1255, 1.5755, 1.096, 0.865, 0.771,
0.359, 0.5065, 0.6805, 1.011, 0.6695, 0.916, 0.9635, 0.997, 1.223,
1.2305, 0.549, 0.5075, 0.3985, 0.6935, 0.8915, 0.592, 1.0005,
0.9545, 1.0675, 1.0905, 1.3205, 0.849, 0.9155, 0.759, 1.131,
0.545, 0.6075, 0.696, 0.7745, 0.707, 1.095, 1.081, 1.0935, 0.771,
0.407, 0.417, 0.2815, 58.05), AP2D103 = c(NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 1.89, 0.637, 0.655, 0.728, 0.496, 0.6405,
0.647, 0.519, 0.5245, 0.784, 0.5065, 0.3155, 0.888, 1.29, 1.078,
2.117, 1.9445, 0.537, 1.483, 0.72, 1.4035, 1.875, 1.5105, 1.917,
2.2765, 3.26, 4.4505, 2.934, 2.176, 3.1805, 3.9025, 2.613, 0.704,
1.123, 0.8075, 1.241, 1.146, 1.3415, 0.9385, 1.264, 0.9355, 0.5185,
0.515, 0.3635, 67.05), AP3B012 = c(NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 0.384, 1.387, 0.913, 2.094, 1.9315, 1.6805,
1.786, 1.9035, 0.9345, 1.1825, 0.745, 0.402, 0.4425, 1.06, 0.796,
0.865, 2.0025, 1.217, 2.362, 2.5695, 2.6205, 2.046, 2.886, 1.7505,
3.9255, 2.385, 3.291, 1.9035, 3.952, 0.9955, 1.1625, 0.8605,
0.5925, 0.894, 0.645, 0.808, 0.848, 1.126, 0.9, 0.842, 1.3375,
0.987, 0.715, 1.0145, 1.181, 1.282, 0.781, 1.0705, 1.198, 1.1105,
1.361, 1.523, 1.367, 2.099, 1.632, 1.482, 1.109, 0.915, 0.7505,
1.041, 1.362, 1.2815, 1.452, 0.8735, 0.7945, 1.4145, 1.053, 0.604,
0.496, 0.5095, 0.6825, 0.692, 0.765, 0.8125, 0.6225, 0.704, 0.8455,
0.8555, 0.9605, 1.374, 0.9885, 1.0875, 0.818, 0.608, 0.3745,
0.477, 0.493, 0.389, 0.5445, 0.5195, 0.416, 0.3045, 0.388, 0.475,
117.45), AP3C003 = c(NaN, NaN, NaN, 0.864, 1.303, 1.526, 1.755,
1.6755, 1.966, 0.9955, 1.826, 2.419, 1.3455, 2.674, 1.2985, 1.136,
1.2045, 1.4395, 1.207, 1.6155, 0.747, 0.3255, 0.5825, 0.6715,
0.7875, 0.5075, 0.7915, 0.6295, 1.0015, 1.0655, 0.791, 0.7365,
0.811, 0.8255, 0.976, 0.886, 0.742, 0.6495, 1.174, 0.7135, 0.5695,
0.4335, 0.403, 0.7665, 0.7705, 0.7535, 0.7935, 0.816, 0.648,
0.609, 0.804, 0.868, 0.6895, 0.633, 0.8025, 0.952, 0.5745, 0.7275,
0.9395, 0.9125, 1.1655, 1.1725, 1.167, 1.716, 1.7405, 0.899,
0.689, 1.2195, 0.566, 1.056, 1.3895, 1.5445, 1.6875, 0.9655,
0.738, 0.9635, 1.0905, 0.5625, 0.555, 0.499, 0.723, 1.0425, 1.143,
0.9495, 0.991, 1.1495, 1.119, 1.637, 1.4185, 1.8495, 1.617, 1.5595,
0.8665, 0.693, 0.5455, 0.4755, 0.4495, 0.4355, 0.461, 0.437,
0.4485, 0.3075, 0.4915, 0.324, 97.2), AP3C004 = c(NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, 1.213, 2.051, 1.9785, 2.014,
1.8175, 1.521, 1.41, 1.6, 1.1845, 1.523, 0.7555, 0.49, 0.3245,
0.3685, 0.396, 0.386, 0.6635, 0.7135, 1.3875, 1.303, 0.6915,
1.26, 1.047, 1.717, 2.556, 1.3405, 1.8075, 1.1115, 1.9395, 0.956,
1.2815, 1.182, 0.986, 1.3365, 0.85, 1.133, 1.2705, 1.44, 1.1495,
0.9655, 1.019, 1.1335, 0.8955, 1.0525, 0.9475, 0.777, 0.5705,
0.841, 0.7975, 0.8365, 0.997, 0.8865, 1.072, 1.1055, 1.1845,
0.769, 0.713, 0.423, 0.557, 0.5115, 0.616, 0.591, 0.8395, 0.834,
0.603, 1.0795, 0.8225, 0.6915, 0.389, 0.587, 0.599, 0.678, 0.541,
0.724, 0.8325, 0.929, 0.955, 1.341, 1.2635, 1.265, 1.1235, 1.29,
0.889, 0.901, 0.589, 0.5495, 1.116, 0.945, 1.084, 1.097, 0.9305,
0.636, 1.1145, 1.0885, 107.55), AP3C006 = c(NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, 2.192, 1.629, 1.7385, 1.529,
1.0845, 1.385, 2.1, 1.262, 1.6985, 1.178, 0.4605, 0.246, 0.395,
0.3085, 0.3435, 0.4205, 0.3575, 0.6065, 0.845, 0.7185, 0.4835,
0.374, 0.841, 1.1355, 0.88, 1.6065, 0.938, 1.951, 1.294, 1.1305,
0.6615, 0.532, 0.991, 0.7385, 0.72, 0.6515, 1.016, 0.701, 0.649,
0.745, 1.064, 0.8215, 0.7775, 0.7215, 0.6425, 0.531, 0.715, 0.5485,
0.5125, 0.535, 0.556, 0.646, 0.761, 0.8585, 0.502, 0.433, 0.3585,
0.288, 0.3925, 0.4115, 0.4905, 0.5765, 0.3925, 0.296, 0.447,
0.466, 0.355, 0.2435, 0.203, 0.2455, 0.276, 0.2345, 0.241, 0.262,
0.2295, 0.2775, 0.367, 0.4045, 0.3855, 0.436, 0.486, 0.391, 0.331,
0.2745, 0.202, 0.2225, 0.252, 0.142, 0.161, NaN, NaN, NaN, NaN,
71.55), AP3C007 = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0.309, 1.271, 1.951, 1.188, 1.279, 0.993, 0.712, 0.751, 1.01,
0.9855, 1.1135, 1.0285, 0.8715, 0.491, 0.6965, 0.712, 0.564,
0.761, 0.5115, 0.9185, 1.415, 0.668, 0.915, 0.561, 1.469, 1.8795,
1.6, 2.2705, 1.307, 2.2295, 1.7, 0.7895, 0.5585, 0.4355, 0.6825,
0.7255, 0.8215, 0.977, 0.8305, 0.658, 0.763, 0.776, 0.569, 0.4475,
0.4725, 0.7665, 0.632, 0.5215, 0.6645, 0.7025, 0.7235, 0.872,
0.6635, 0.8305, 1.112, 0.9745, 0.6345, 0.605, 0.325, 0.333, 0.489,
0.4165, 0.5165, 0.681, 0.63, 0.494, 0.633, 0.5205, 0.3675, 0.3925,
0.357, 0.3945, 0.355, 0.3895, 0.522, 0.4945, 0.4045, 0.4335,
0.5165, 0.534, 0.703, 0.6705, 0.902, 0.5525, 0.499, 0.298, 0.2415,
0.1995, 0.217, 0.2215, 0.2945, 0.3755, 0.2775, 0.299, 0.243,
74.7), AP3C009 = c(NaN, NaN, NaN, NaN, NaN, NaN, 1.27, 1.569,
1.835, 0.497, 0.868, 1.247, 0.8285, 1.2515, 0.933, 0.9325, 0.89,
1.053, 1.1155, 1.534, 1.1725, 0.509, 0.453, 0.669, 0.6005, 0.4645,
0.764, 0.9665, 1.6815, 2.199, 1.459, 1.819, 1.3145, 1.6195, 2.505,
2.5875, 3.046, 2.106, 3.367, 1.8815, 2.1315, 1.559, 1.3835, 2.3815,
1.894, 2.088, 2.3115, 2.7445, 2.0005, 1.383, 1.92, 2.1055, 1.532,
1.6305, 2.055, 1.7215, 1.4205, 1.4015, 1.459, 1.53, 2.0205, 1.496,
1.362, 1.923, 1.9535, 1.4275, 1.0955, 0.6085, 0.5295, 0.634,
0.9845, 1.1095, 1.4335, 0.6545, 0.5525, 0.842, 0.949, 0.5215,
0.3105, 0.311, 0.4625, 0.4255, 0.326, 0.419, 0.318, 0.336, 0.456,
0.502, 0.69, 0.953, 0.5705, 0.913, 0.5185, 0.5145, 0.3585, 0.2685,
0.334, 0.2435, 0.3295, 0.32, 0.32, 0.225, 0.268, 0.1815, 116.1
)), .Names = c("AP2D005", "AP2D006", "AP2D009", "AP2D101", "AP2D102",
"AP2D103", "AP3B012", "AP3C003", "AP3C004", "AP3C006", "AP3C007",
"AP3C009"), row.names = c("1909", "1910", "1911", "1912", "1913",
"1914", "1915", "1916", "1917", "1918", "1919", "1920", "1921",
"1922", "1923", "1924", "1925", "1926", "1927", "1928", "1929",
"1930", "1931", "1932", "1933", "1934", "1935", "1936", "1937",
"1938", "1939", "1940", "1941", "1942", "1943", "1944", "1945",
"1946", "1947", "1948", "1949", "1950", "1951", "1952", "1953",
"1954", "1955", "1956", "1957", "1958", "1959", "1960", "1961",
"1962", "1963", "1964", "1965", "1966", "1967", "1968", "1969",
"1970", "1971", "1972", "1973", "1974", "1975", "1976", "1977",
"1978", "1979", "1980", "1981", "1982", "1983", "1984", "1985",
"1986", "1987", "1988", "1989", "1990", "1991", "1992", "1993",
"1994", "1995", "1996", "1997", "1998", "1999", "2000", "2001",
"2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009",
"2010", "2011", "2012", "RBHinBarkmm"), class = "data.frame")
Any help would be great. Thanks!
I believe this does what you describe:
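cumsum.alt <- function(x) {
# the last element holds the known total
rh <- x[length(x)]
# remaining values in reverse order (most recent first)
rx <- rev(x)[-1]
r <- rep(NA, length(x))
# subtract the running sum of the non-NA values from the total
dx <- rh - cumsum(c(0, rx[!is.na(rx)]))
# write the results into the positions that held non-NA values
r[c(!is.na(rx), FALSE)] <- dx[-length(dx)]
# the final difference goes right after the last filled position
r[max(which(!is.na(r))) + 1] <- dx[length(dx)]
rev(r)
}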
cumsum.alt(c(1,2,3,NA,50))
# [1] 44 45 47 50 NA
cumsum.alt(c(NA,1,2,3,50))
# [1] NA 44 45 47 50
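To run this over a whole data frame, the function can be applied column by column. The following is only a sketch: the data frame `toy` and its values are made up for illustration (they reuse the two vectors from the examples above), and `cumsum.alt` is repeated so the block runs on its own.

```r
# cumsum.alt as defined in the answer above, repeated so this block is self-contained
cumsum.alt <- function(x) {
  rh <- x[length(x)]
  rx <- rev(x)[-1]
  r <- rep(NA, length(x))
  dx <- rh - cumsum(c(0, rx[!is.na(rx)]))
  r[c(!is.na(rx), FALSE)] <- dx[-length(dx)]
  r[max(which(!is.na(r))) + 1] <- dx[length(dx)]
  rev(r)
}

# Hypothetical toy data: two series, with the known total in the final row
toy <- data.frame(
  A = c(NA, 1, 2, 3, 50),
  B = c(1, 2, 3, NA, 50),
  row.names = c("2009", "2010", "2011", "2012", "total")
)

# Apply the function to every column; assigning into toy_out[] keeps
# the data frame structure and the row names
toy_out <- toy
toy_out[] <- lapply(toy, cumsum.alt)
toy_out
```

Assigning into `toy_out[]` rather than calling `sapply` is a design choice here: it avoids converting the result to a matrix and so preserves the year labels as row names.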
I'm not sure I understand the question correctly. It's just a cumsum that fills in NA entries with NA instead of the previous known value, right?
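cumsum.alt <- function(x){
# positions holding NA in x stay NA in the result
res <- rep(NA,length(x))
sumtohere <- 0
# seq_along() is safe for zero-length input, unlike seq()
for(i in seq_along(x)){
if (!is.na(x[i])){
# extend the running sum only with known values
sumtohere <- sumtohere+x[i]
res[i] <- sumtohere
}
}
res
}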
What's this talk about needing the last row to have a value? All these example last rows have values. If it's NA, what should it be filled with?
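The loop in this answer can also be written without an explicit loop. This is a minimal sketch under the same reading of the question (NA entries stay NA and are skipped in the running sum); the name `cumsum_skip_na` is made up for illustration.

```r
# Hypothetical helper: cumulative sum that skips NA values
# but leaves NA in the positions where the input was NA
cumsum_skip_na <- function(x) {
  # treat missing values as 0 so they do not affect the running sum
  res <- cumsum(replace(x, is.na(x), 0))
  # restore NA at the positions that were missing in the input
  res[is.na(x)] <- NA
  res
}

cumsum_skip_na(c(1, 2, NA, 3))
# [1]  1  3 NA  6
```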
