I am trying to plot a 3D graph in R, but I need custom ranges for the axes. For example, the X axis has the values [1,2,3,4,5,6,7],
and Y and Z both have the values [1,2,3,4,5,6,7,8,9,10].
The idea is to test whether X influences Y and Z; Y and Z are the dependent variables.
I tried the following code:
persp(mat, col = heat.colors(20) ,phi = 30, theta = -30, scale = TRUE)
mat is a matrix in the following format:
V1 V2 V3
1 1 1.000000
2 1 1.709133
4 1 3.278188
8 1 5.082078
16 1 5.753403
32 1 5.778228
64 1 5.783567
1 2 1.000000
2 2 1.789333
4 2 3.478188
8 2 5.182078
16 2 5.853403
32 2 5.877228
64 2 5.908357
...... (V2 continues in the same way up to 10)
But I still couldn't set the X, Y and Z axes to the required ranges. Any idea how to customize them, or is there another way to do this in R?
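One possible direction, offered as a sketch rather than a definitive answer: persp() expects z as a matrix of heights with one row per x value and one column per y value, so a three-column "long" table like the one above has to be reshaped first. The axis ranges can then be set with the xlim, ylim and zlim arguments. The data values, the name long and the zlim of c(0, 10) below are made up for illustration.

```r
# Made-up long-format data: V1 = x, V2 = y, V3 = z (same layout as mat above)
long <- expand.grid(V1 = 2^(0:6), V2 = 1:10)
long$V3 <- log2(long$V1) + 0.1 * long$V2

# Axis values, in increasing order as persp() requires
xs <- sort(unique(long$V1))
ys <- sort(unique(long$V2))

# Build z so that z[i, j] is the height at (xs[i], ys[j])
z <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))
z[cbind(match(long$V1, xs), match(long$V2, ys))] <- long$V3

# Custom axis ranges go in xlim / ylim / zlim
persp(x = xs, y = ys, z = z,
      xlim = range(xs), ylim = range(ys), zlim = c(0, 10),
      col = heat.colors(20), phi = 30, theta = -30,
      ticktype = "detailed", xlab = "X", ylab = "Y", zlab = "Z")
```

If the X axis should run over the evenly spaced positions 1 to 7 instead of the raw values, passing x = seq_along(xs) is one option.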
I'm looking for an easy way to add the minimum value for each column inside my dataframe.
This feels like a common thing, but I haven't been able to find any good answers yet...maybe I'm missing something obvious.
Let's say I've got two columns (in reality I have close to 100) with positive and negative numbers.
w <- c(9, 9, 9, 9)
x <- c(-2, 0, 1, 3)
y <- c(-1, 1, 3, 4)
z <- as.data.frame(cbind(w, x, y))
w x y
1 9 -2 -1
2 9 0 1
3 9 1 3
4 9 3 4
I want z to look like this after transforming only the x and y columns (z[, 2:3]):
w x y
1 9 0 0
2 9 2 2
3 9 3 4
4 9 5 5
Does that make sense?
library(dplyr)
dplyr::mutate(z, across(c(x, y), ~ . + abs(min(.))))
w x y
1 9 0 0
2 9 2 2
3 9 3 4
4 9 5 5
You can also select by column position rather than column name by changing c(x, y) to 2:3, or to c(2:3, 5) for non-sequential column positions.
It depends on exactly what you mean and what you want to happen if there aren't any negative values. Whatever the values are, this will anchor the minimum at 0, but you should be able to adapt it if you want something slightly different.
z[] = lapply(z, function(col) col - min(col))
z
# x y
# 1 0 0
# 2 2 2
# 3 3 4
# 4 5 5
As a side note, as.data.frame(cbind(x, y)) is bad - if you have a mix of numeric and character values, cbind() will convert everything to character. It's shorter and better to simplify to data.frame(x, y).
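If only some columns should be shifted (x and y but not w, as in the question), the same lapply() idea can be limited to a subset of columns. A minimal sketch, assuming the columns are picked by name through a helper vector cols (a name introduced here for illustration):

```r
z <- data.frame(w = c(9, 9, 9, 9), x = c(-2, 0, 1, 3), y = c(-1, 1, 3, 4))

# Columns to shift; positions such as 2:3 would work as well
cols <- c("x", "y")

# Subtract each selected column's minimum so that it starts at 0
z[cols] <- lapply(z[cols], function(col) col - min(col))
z
```

Columns that are not listed in cols, here w, are left untouched.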
Do you want this?
z[] <- lapply(z, function(columnValues) columnValues + abs(min(columnValues)))
I am trying to make a single plot of the trajectory of many particles from a Brownian Motion experiment.
There are five measurements for each particle, a total of 10, for the x and y components of position.
I have the data in multiple data structures, as I am unaware of which is most useful for the end I aim to achieve.
1. All within a single data frame, with my 5 time measurements in x for the 16 particles measured, followed by the 16 for the y component.
2. In two separate data frames, one for the x-component and one for the y.
I tried using rbind to create a single array that I can pass to geom_line(), but this gives one single line in which each particle's trajectory is connected to the next.
How could I go about drawing these as separate lines, all within one x-y plane? Thanks
The easiest way to achieve this is to have 3 columns, one for the common x component, one for the y, and one for the particle. To get this you'll need to convert your data to long format:
> df <- data.frame(t=c(1,2,3,4,5), x.1 = c(-1,1,3,4,5), x.2 = c(5,2,1,4,6))
> df
t x.1 x.2
1 1 -1 5
2 2 1 2
3 3 3 1
4 4 4 4
5 5 5 6
> (df <- tidyr::gather(df, "particle", "y", -t))
t particle y
1 1 x.1 -1
2 2 x.1 1
3 3 x.1 3
4 4 x.1 4
5 5 x.1 5
6 1 x.2 5
7 2 x.2 2
8 3 x.2 1
9 4 x.2 4
10 5 x.2 6
Then, use the group parameter to geom_line to plot them separately:
ggplot(df, aes(x = t, y = y)) + geom_line(aes(group = particle, color = particle))
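The example above starts from one wide data frame. If the x and y components sit in two separate wide data frames, as described in the question, base R's stack() is one way to reach a long format with a particle column. A minimal sketch, assuming hypothetical data frames x_wide and y_wide with one column per particle and rows in time order (all names and values are made up):

```r
# Hypothetical wide inputs: one row per time step, one column per particle
x_wide <- data.frame(p1 = c(0, 1, 2, 1, 3), p2 = c(5, 4, 4, 6, 7))
y_wide <- data.frame(p1 = c(0, -1, 0, 2, 1), p2 = c(2, 3, 1, 0, 1))

# stack() puts the columns on top of each other: 'values' holds the numbers,
# 'ind' holds the name of the column each number came from
x_long <- stack(x_wide)
y_long <- stack(y_wide)

# One row per particle and time step
tracks <- data.frame(
  particle = x_long$ind,
  step     = rep(seq_len(nrow(x_wide)), times = ncol(x_wide)),
  x        = x_long$values,
  y        = y_long$values
)
tracks
```

tracks can then be passed to ggplot() with aes(x = x, y = y, group = particle) and geom_path(), which connects the points in row order.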
First, you need to have your data in this format:
library(data.table)
data <- data.table(particle = as.factor(rep(1:3, each = 5)),
x = sample(-10:10, 15, replace = TRUE),
y = sample(-10:10, 15, replace = TRUE))
data
particle x y
1: 1 -8 -4
2: 1 -5 -2
3: 1 -1 -5
4: 1 -3 9
5: 1 4 -7
6: 2 2 1
7: 2 -8 -10
8: 2 -4 -8
9: 2 -6 -4
10: 2 -8 -3
11: 3 -10 10
12: 3 6 -5
13: 3 -5 -6
14: 3 -6 8
15: 3 1 -4
One column identifies the particle and the other two hold the position coordinates.
This link might help you reshape your data: http://www.cookbook-r.com/Manipulating_data/Converting_data_between_wide_and_long_format/
Then just plot, grouping by particle (using the color aesthetic):
ggplot(data = data,
aes(x = x, y = y, color = particle)) +
geom_path(size = 3)
If you want to change the order of the path, just add a column of time and sort the df by that column.
I have a horizontally oriented matrix with x = time and y = stocks' returns. I'd like to plot it with rCharts to make it interactive but I can't find HOW anywhere...
The matrix looks like this:
matTest <- as.data.frame(matrix(rnorm(50,0,1), nrow = 5, ncol = 10))
colnames(matTest) <- c('t0','t1','t2','t3','t4','t5','t6','t7','t8','t9')
rownames(matTest) <- c('stock1','stock2','stock3', 'stock4','stock5')
Do you know how I can do that?
Thank you very much
If you need an interactive table, you can use this code on your original data.
library(DT)
datatable(matTest, options = list(pageLength = 9))
If you want an interactive time_series plot, first of all change the format of your data in this way:
df<-as.data.frame(cbind(as.matrix(as.vector(t(matTest))),c(1:ncol(matTest)-1),unlist(lapply(rownames(matTest),rep,times=ncol(matTest)))))
colnames(df)<-c("time_series","time","stock")
df
time_series time stock
1 -0.813688587253615 0 1
2 -0.457763419325742 1 1
3 0.0756429812511287 2 1
4 2.18700453503453 3 1
5 1.00659661717065 4 1
6 -2.16436341755656 5 1
7 -0.0829999360152501 6 1
8 -0.491237208736282 7 1
9 0.351591891565934 8 1
10 0.138073915553248 9 1
11 0.276431050047784 0 2
12 -0.88208290628419 1 2
13 0.421498167781597 2 2
...
Now use rCharts to plot your time_series:
library("rCharts")
xPlot(time_series~time, group="stock",data=df,type="line-dotted")
Now you can adjust the plot's parameters until it looks the way you want.
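Note that the cbind() call in the reshaping step mixes numbers with row names, so every column of df ends up as character. A hedged alternative sketch that builds the same long layout with data.frame(), keeping time_series and time numeric and using the row names as stock labels (set.seed(1) is added here only to make the example reproducible):

```r
set.seed(1)
matTest <- as.data.frame(matrix(rnorm(50), nrow = 5, ncol = 10))
colnames(matTest) <- paste0("t", 0:9)
rownames(matTest) <- paste0("stock", 1:5)

# One row per stock and time point; t() makes each stock's values contiguous
df <- data.frame(
  time_series = as.vector(t(as.matrix(matTest))),
  time        = rep(seq_len(ncol(matTest)) - 1, times = nrow(matTest)),
  stock       = rep(rownames(matTest), each = ncol(matTest))
)
str(df)
```

The resulting df has the same three column names as before, so it can be used in the plotting call in the same way.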
I have a problem that I suspect has arisen from a dplyr update combined with my hacky code. Given a data frame in which every row is duplicated, I want to assign each row a unique id by combining the entries of two columns with either "_" or "a_" in the middle. I also want to assign a group id by combining the entries of one column with either "" or "a". Because these formats are important for lining up with another data frame, I can't use solutions based on interact and factor that I've seen in other posts.
So I want to go from this:
Generation Identity
1 1 X
2 1 Y
3 1 Z
4 2 X
5 2 Y
6 2 Z
7 3 X
8 3 Y
9 3 Z
10 1 X
11 1 Y
12 1 Z
13 2 X
14 2 Y
15 2 Z
16 3 X
17 3 Y
18 3 Z
to this:
Generation Identity Unique_id Group_id
1 1 X 1_X X
2 1 Y 1_Y Y
3 1 Z 1_Z Z
4 2 X 2_X X
5 2 Y 2_Y Y
6 2 Z 2_Z Z
7 3 X 3_X X
8 3 Y 3_Y Y
9 3 Z 3_Z Z
10 1 X 1a_X Xa
11 1 Y 1a_Y Ya
12 1 Z 1a_Z Za
13 2 X 2a_X Xa
14 2 Y 2a_Y Ya
15 2 Z 2a_Z Za
16 3 X 3a_X Xa
17 3 Y 3a_Y Ya
18 3 Z 3a_Z Za
The minimal example below is based on code that previously worked for me and others in setting the unique id but that now causes RStudio to crash with a seg fault (Exception Type: EXC_BAD_ACCESS (SIGSEGV)). When I call a function containing this code it generates the message
Error in match(vector, df$Unique_id) : 'translateCharUTF8' must be
called on a CHARSXP
which I've read can be symptomatic of memory issues.
library(dplyr)
dff <- data.frame(Generation = rep(1:3, each = 3),
Identity = rep(LETTERS[24:26], times = 3))
dff <- rbind(dff, dff) # duplicate rows
dff <- group_by_(dff, ~Generation, ~Identity) %>%
mutate(Unique_id = c(paste0(Identity[1], "_", Generation[1]), paste0(Identity[1], "a", "_", Generation[1]))) %>%
ungroup
I think the problem is related to an update of dplyr (I'm using the latest release versions of RStudio and all packages, on OSX Sierra). In any case, my solution above is something of a hack. I'd very much appreciate suggestions for improved code, preferably using either base R or dplyr (since the code is part of a package that currently depends on dplyr).
Here is how you can approach the problem:
First find the duplicates of your data. I called my data A
dup=duplicated(A)
Then add a counter column:
A$count=1:nrow(A)
n=ncol(A)#THE COLUMN ADDED
Now build the two columns needed and cbind them with the original data frame:
B=data.frame(t(apply(A,1,function(x)
if(dup[as.numeric(x[n])]) c(paste0(x["Identity"],"a"),paste(x[-n],collapse="a_"))
else c(x["Identity"],paste(x[-n],collapse="_")))))
`names<-`(cbind(A[-n],B),c(names(A[-n]),"Group_ID","Unique_ID"))
Generation Identity Group_ID Unique_ID
1 1 X X 1_X
2 1 Y Y 1_Y
3 1 Z Z 1_Z
4 2 X X 2_X
5 2 Y Y 2_Y
6 2 Z Z 2_Z
7 3 X X 3_X
8 3 Y Y 3_Y
9 3 Z Z 3_Z
10 1 X Xa 1a_X
11 1 Y Ya 1a_Y
12 1 Z Za 1a_Z
13 2 X Xa 2a_X
14 2 Y Ya 2a_Y
15 2 Z Za 2a_Z
16 3 X Xa 3a_X
17 3 Y Ya 3a_Y
18 3 Z Za 3a_Z
Here's my amended version of Onyambu's solution, which refers to columns by name rather than number (and so can handle data frames that have additional columns):
dup <- duplicated(dff) # identify duplicates
dff$count <- 1:nrow(dff) # add count column to the dataframe
# create a new dataframe containing the unique and group ids:
B <- data.frame(t(apply(dff, 1, function(x)
if(dup[as.numeric(x["count"])]) c(paste0(x["Identity"], "a"),
paste(x["Identity"], x["Generation"], sep = "a_"))
else c(x["Identity"], paste(x["Identity"], x["Generation"], sep = "_")))))
# combine the dataframes:
colnames(B) <- c("Group_id", "Unique_id")
dff <- cbind(dff[-ncol(dff)], B)
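For comparison, the same ids can be built without apply() by letting duplicated() decide which rows get the "a" marker. This is only a sketch in base R: it assumes every Generation/Identity pair occurs exactly twice, and it follows the Generation-first format of the desired output table ("1_X", "1a_X"). The helper name suffix is introduced here for illustration.

```r
dff <- data.frame(Generation = rep(1:3, each = 3),
                  Identity = rep(LETTERS[24:26], times = 3))
dff <- rbind(dff, dff) # duplicate rows

# "a" for the second occurrence of each Generation/Identity pair, "" otherwise
suffix <- ifelse(duplicated(dff[c("Generation", "Identity")]), "a", "")

dff$Unique_id <- paste0(dff$Generation, suffix, "_", dff$Identity)
dff$Group_id  <- paste0(dff$Identity, suffix)
dff
```

Because paste0() and ifelse() are vectorised, no counter column or row-wise function is needed.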
How do I create a new column whose value depends on a cell value in another row?
x y z
1 a 1 10
2 a 2 20
3 a 3 30
4 b 1 40
This is my sample data. I want the final output to be as follows:
x y z prevY
1 a 1 10 0
2 a 2 20 10
3 a 3 30 20
4 b 1 40 0
where prevY is the z value of the row with x = current_x_val and y = current_y_val - 1, or 0 if no such row exists.
How do I achieve this?
My progress so far:
data[data$x == "a" & data$y==2-1,3]
I enter the values manually and get prevY for one row at a time. But how do I do it for all rows in a single shot?
Or data.table solution (similar to MrFlick) but faster for a big data set
library(data.table)
setDT(dat)[, prevY := c(0, z[-length(z)]), by = x]
Here you can use the ave() function for doing group level transformations (here, a different transformation for each value of x).
dd$prevY <- with(dd, ave(z, x, FUN=function(x) head(c(0,x),-1)))
Here we take the values of z for each value of x and add a zero on the front and remove the last value. Then we assign this back to the data.frame.
This assumes that all the y values are sorted within each x group.
The result is
x y z prevY
1 a 1 10 0
2 a 2 20 10
3 a 3 30 20
4 b 1 40 0
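Both answers rely on the rows being sorted by y within each x group, with no gaps in y. If that cannot be guaranteed, the previous row can be looked up explicitly by its (x, y - 1) key instead. A minimal base R sketch, assuming the data frame is called dd as in the ave() answer and that x contains no spaces (the keys are built with paste()):

```r
dd <- data.frame(x = c("a", "a", "a", "b"),
                 y = c(1, 2, 3, 1),
                 z = c(10, 20, 30, 40))

# Key of each row, and key of the row it should take its value from
key_now  <- paste(dd$x, dd$y)
key_prev <- paste(dd$x, dd$y - 1)

# Position of the "previous" row, NA when there is none
idx <- match(key_prev, key_now)

# z of the previous row, or 0 if no such row exists
dd$prevY <- ifelse(is.na(idx), 0, dd$z[idx])
dd
```

This gives the same prevY column as above for the sample data, and it does not depend on row order.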