R: Positioning labels and axes with rgl.plot3d

I'm trying to create a 3d scatter plot using rgl.plot3d. However, the default positioning of the labels and axes is not satisfactory. E.g., the y-axis label is positioned on the far side, while I want it on the near side. The x-axis ticks are positioned at the far top; I want them at the near bottom. I looked at ?par3d but couldn't find anything that would help me. Is it possible to do this in rgl? Code and data are given below. Thank you.
Code
d <- read.table(file='myfile.dat', header=F)
plot3d(
d,
xlim=c(0,20),
ylim=c(0,20),
zlim=c(0,10000),
box=F,
type='p',
size=5,
col=d[,1]
)
mtext3d(text='Test', edge='y+-', line=2)
axes3d(
edges=c('x--', 'y+-', 'z--'),
labels=T
)
lines3d(
d,
lwd=2,
col=d[,1]
)
grid3d(side=c('x', 'y+', 'z'))
Data
11 2 2
NA NA NA
10 2 2
NA NA NA
13 2 1
NA NA NA
15 2 1
NA NA NA
5 2 11
5 3 10
5 4 16
5 5 34
5 6 102
5 7 294
5 8 682
5 9 1439
5 10 2646
5 11 3615
5 12 2844
5 13 1394
NA NA NA
4 2 10
4 3 4
4 4 4
4 5 10
4 6 38
4 7 132
4 8 396
4 9 976
4 10 2121
4 11 4085
4 12 6261
4 13 6459
4 14 4238
4 15 1394
NA NA NA
7 2 3
NA NA NA
6 2 2
NA NA NA
9 2 8
9 3 6
9 4 4
9 5 5
NA NA NA
8 2 4
8 3 10
8 4 22
8 5 52
8 6 126
8 7 264
8 8 478
8 9 729
8 10 943
8 11 754
8 12 382
NA NA NA

You need to look at ?axis3d, where the use of the 'edges' parameter is described. If you want the x-axis tick labels at the front-bottom and the y-axis on the near-bottom side, first build the plot using ..., axes=FALSE, and then, with the rgl window focus unchanged, issue this command at the console:
axes3d( edges=c("x--", "y--", "z") )
I have not yet figured out whether it is possible to remove an existing axis in an rgl plot.
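Putting the pieces together, a minimal sketch might look like the following. The data frame here is a made-up stand-in for myfile.dat, the label text is arbitrary, and the drawing calls are guarded so they only run in an interactive session with rgl installed. Which edge looks "near" depends on the current viewpoint, so the edge codes may need adjusting.

```r
# Edge codes as described in ?axis3d: "x--" is the x axis drawn along the
# edge where y and z are both at their minimum, and so on.
edges <- c("x--", "y--", "z")

# Hypothetical stand-in data (not the asker's file)
d <- data.frame(x = c(5, 5, 5, 4, 4), y = c(2, 3, 4, 2, 3), z = c(11, 10, 16, 10, 4))

if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  # Draw the points without the default axes and axis labels
  rgl::plot3d(d, axes = FALSE, xlab = "", ylab = "", zlab = "",
              type = "p", size = 5)
  # Add axes only on the chosen edges
  rgl::axes3d(edges = edges)
  # Put the y-axis label on the same edge as the y axis
  rgl::mtext3d("Size", edge = "y--", line = 2)
}
```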

Related

Error with repolr: Error in rowSums(t(mapply(complete.cases, split.data))) : 'x' must be numeric

I am trying to analyse a dataset in R.
The data set looks something like this:
ID visit REL_3 LOAN ee dp pa exer alcohol e p d
1 2 1 4 2 44 12 32 122.0 8 2 0 2
2 2 2 4 2 44 48 75 78.5 8 2 2 2
3 2 3 4 1 26 17 49 222.5 8 1 2 2
4 2 4 NA NA NA NA NA NA NA NA NA NA
5 3 1 4 6 27 13 48 78.0 44 2 2 2
6 3 2 4 6 46 13 37 49.0 38 2 1 2
Except for ID and visit, all the variables are numeric.
However when I try to fit:
repolr(e~REL_3LOANalcohol*exer, data=mss, categories=3,subjects = "ID",times = c(1,2,3,4),corr.mod = "ar1",alpha=0.5)
I get Error
Error in rowSums(t(mapply(complete.cases, split.data))) : 'x' must be numeric
I would appreciate some help on this if possible. I had to add NAs because the repolr package requires entries for all the patients for all the visits even if there is no data for that particular visit and subject.
I don't know how to proceed. I would really appreciate some help on this.
Regards,
Shalom

How to treat empty values in the columns of a data set in R?

I have learned how to impute NA values in R: we normally find the average of the column (if it is numeric) and put that in place of the NA. But what should I do if, instead of NA, the cell is simply empty?
Please help me.
Let's start with some test data:
person_id <- c("1","2","3","4","5","6","7","8","9","10")
inches <- as.numeric(c("56","58","60","62","64","","68","70","72","74"))
height <- data.frame(person_id,inches)
height
person_id inches
1 1 56
2 2 58
3 3 60
4 4 62
5 5 64
6 6 NA
7 7 68
8 8 70
9 9 72
10 10 74
The blank was already replaced with NA in height$inches, because as.numeric() converts an empty string to NA.
If the column were still character, you could also do this yourself:
height$inches[height$inches==""] <- NA
Now fill in the NA with the average of the non-missing values of inches.
options(digits=4)
height$inches[is.na(height$inches)] <- mean(height$inches,na.rm=T)
height
person_id inches
1 1 56.00
2 2 58.00
3 3 60.00
4 4 62.00
5 5 64.00
6 6 64.89
7 7 68.00
8 8 70.00
9 9 72.00
10 10 74.00
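If the blanks come from a file, they can be turned into NA at read time. The sketch below uses a small made-up CSV (the eye_color column is hypothetical). Blank fields in numeric columns are read as NA by default; na.strings = "" extends that to character columns. For a character vector that is already loaded, blank or whitespace-only entries can be replaced by hand.

```r
# Hypothetical CSV with one blank numeric cell and one blank character cell
csv_text <- "person_id,inches,eye_color
1,56,brown
2,58,
3,,blue
4,62,green"

# Treat empty fields as NA in every column, including character ones
height <- read.csv(text = csv_text, na.strings = c("", "NA"),
                   stringsAsFactors = FALSE)

# Impute the numeric column with the mean of the non-missing values
height$inches[is.na(height$inches)] <- mean(height$inches, na.rm = TRUE)

# For a character vector already in memory, blank out empty or
# whitespace-only entries before converting to numeric
raw <- c("56", "", " ", "62")
raw[trimws(raw) == ""] <- NA
values <- as.numeric(raw)
```

A mean is only one possible fill-in value; for character columns such as eye_color you would have to pick a different rule (for example the most frequent level) or leave the NA in place.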

Subsetting multiple variables in one column in r

I only have basic knowledge of R, and I hope you can help me with my problem and that it's not too stupid a question ;-)
I have a dataset called "rope". It looks like the following :
head(rope)
X...Sound Time.real. Time.in.Video. Observations
1 5_min_blank 10:18 03:59 (2) 2
2 5_min_blank NA
3 Fisch1 10:23 08:59 6
4 Fisch1 NA
5 Fisch1 NA
6 Fisch1 NA
Observation.total.time Time.of.the.shark.in.the.video
1 60 23
2 37
3 157 17
4 46
5 37
6 28
Time.of.the.shark.entering.the.video
1 04:03
2 04:20
3 08:49
4 09:06
5 09:23
6 10:21
Time.of.the.shark.leaving.the.video
1 04:26
2 04:57
3 09:05
4 09:52
5 10:00
6 10:49
times.the.shark.turns.to.the.speaker directional.change
1 1 5
2 2 11
3 1 1
4 4 6
5 3 6
6 2 7
flap.of.the.fins..fotf. flap.of.the.fins..second corrected.fotf.s
1 14 0,608695652 0.7777778
2 14 0,378378378 0.5600000
3 0 NA
4 30 0,652173913 0.6818182
5 0 0 NA
6 15 0,535714286 0.6521739
Notes complete.cyrcles swims.below.b..above.a..speaker
1 1 NA
2 NA
3 NA
4 2 NA
5 NA
6 NA
Swimming.patterns date X
1 3 21.07.17 NA
2 9 21.07.17 NA
3 NA 21.07.17 NA
4 9 21.07.17 NA
5 4 21.07.17 NA
6 4 21.07.17 NA
Now I have different sounds. The first sound is "Fish1", but I also have "Fish2" and "Diving", for example. Furthermore, between the sounds there are the corresponding pauses, which are called "Fish1_pause", "Fish2_pause", "Diving_pause", etc.
Now I would like to subset my data into the sound data points and the "pause" data points.
I tried:
sound<-subset(rope, rope$X...Sound=="Fish1"& rope$X...Sound=="Fish2")
but I got no data points at all. If I only type:
sound<-subset(rope, rope$X...Sound=="Fish1")
I receive all data points where I have the Fish1 sound.
My question now is: how can I get all sound points?
With the "&" it didn't work. I hope you understand my problem and can help me.
Thank you very much and all the best
Jessi
sound<-subset(rope, rope$X...Sound=="Fish1"& rope$X...Sound=="Fish2")
should be replaced by either
sound<-subset(rope, rope$X...Sound == "Fish1" | rope$X...Sound == "Fish2")
or
sound<-subset(rope, rope$X...Sound %in% c("Fish1","Fish2"))
As it is, you are asking for observations where X...Sound is simultaneously "Fish1" and "Fish2" -- which is impossible.
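To split every sound from its pause without listing each name, one option is to match on the "_pause" suffix. This is only a sketch on made-up rows: it assumes pause labels always end in "_pause" and that everything else (including a label like "5_min_blank") should count as a sound row, which may not be what you want.

```r
# Hypothetical miniature of the 'rope' data with the same column name
rope <- data.frame(
  X...Sound = c("Fish1", "Fish1_pause", "Fish2", "Fish2_pause",
                "Diving", "Diving_pause"),
  Observations = c(6, 1, 3, 0, 2, 1),
  stringsAsFactors = FALSE
)

# TRUE for rows whose label ends in "_pause"
is_pause <- grepl("_pause$", rope$X...Sound)

pause <- rope[is_pause, ]   # all pause rows
sound <- rope[!is_pause, ]  # everything else
```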

Aggregation of all possible unique combinations with observations in the same column in R

I am trying to shorten a chunk of code to make it faster and easier to modify. This is a short example of my data.
order obs year var1 var2 var3
1 3 1 1 32 588 NA
2 4 1 2 33 689 2385
3 5 1 3 NA 678 2369
4 33 3 1 10 214 1274
5 34 3 2 10 237 1345
6 35 3 3 10 242 1393
7 78 6 1 5 62 NA
8 79 6 2 5 75 296
9 80 6 3 5 76 500
10 93 7 1 NA NA NA
11 94 7 2 4 86 247
12 95 7 3 3 54 207
Basically, I want R to find every possible unique combination of two values (observations) in column "obs", within the same year, and to create a new matrix or data frame whose rows are the aggregation of the original pairs. Order is not important, so 1+6 = 6+1. For instance, with 150 observations I would expect 11,175 feasible combinations (each year).
I sort of got what I want with basic coding but, as you will see, it is way too long (I have built 66 different new data sets this way, so it does not really make sense) and I am wondering how to shorten it. I did some trials (plyr, ...) with no real success. Here is what I did:
# For the 1st year, groups of 2 obs
newmatrix <- data.frame(t(combn(unique(data$obs[data$year==1]), 2)))
colnames(newmatrix) <- c("obs1", "obs2")
newmatrix$name <- do.call(paste, c(newmatrix[c("obs1", "obs2")], sep = "_"))
# and the aggregation of var. using indexes, which I will skip here to save your time :)
To illustrate, here is the result I would get for the 1st year with the sample above. The NAs appear because I only computed pairs where both values were valid, and only for variables 1 and 3. I used the sum here, but it could be any other function:
order obs1 obs2 year var1 var3
1 1 1 3 1_3 42 NA
2 2 1 6 1_6 37 NA
3 3 1 7 1_7 NA NA
4 4 3 6 3_6 15 NA
5 5 3 7 3_7 NA NA
6 6 6 7 6_7 NA NA
As for the 2 first lines in the 3rd year, same type of matrix:
order obs1 obs2 year var1 var3
1 1 1 3 1_3 NA 3762
2 2 1 6 1_6 NA 2869
.......... etc ............
I hope I explained myself. Thank you in advance for your hints on how to do this more efficiently.
I would use split-apply-combine to split by year, find all the combinations, and then combine back together:
do.call(rbind, lapply(split(data, data$year), function(x) {
p <- combn(nrow(x), 2)
data.frame(order=paste(x$order[p[1,]], x$order[p[2,]], sep="_"),
obs1=x$obs[p[1,]],
obs2=x$obs[p[2,]],
year=x$year[1],
var1=x$var1[p[1,]] + x$var1[p[2,]],
var2=x$var2[p[1,]] + x$var2[p[2,]],
var3=x$var3[p[1,]] + x$var3[p[2,]])
}))
# order obs1 obs2 year var1 var2 var3
# 1.1 3_33 1 3 1 42 802 NA
# 1.2 3_78 1 6 1 37 650 NA
# 1.3 3_93 1 7 1 NA NA NA
# 1.4 33_78 3 6 1 15 276 NA
# 1.5 33_93 3 7 1 NA NA NA
# 1.6 78_93 6 7 1 NA NA NA
# 2.1 4_34 1 3 2 43 926 3730
# 2.2 4_79 1 6 2 38 764 2681
# 2.3 4_94 1 7 2 37 775 2632
# 2.4 34_79 3 6 2 15 312 1641
# 2.5 34_94 3 7 2 14 323 1592
# 2.6 79_94 6 7 2 9 161 543
# 3.1 5_35 1 3 3 NA 920 3762
# 3.2 5_80 1 6 3 NA 754 2869
# 3.3 5_95 1 7 3 NA 732 2576
# 3.4 35_80 3 6 3 15 318 1893
# 3.5 35_95 3 7 3 13 296 1600
# 3.6 80_95 6 7 3 8 130 707
This enables you to be very flexible in how you combine data pairs of observations within a year --- x[p[1,],] represents the year-specific data for the first element in each pair and x[p[2,],] represents the year-specific data for the second element in each pair. You can return a year-specific data frame with any combination of data for the pairs, and the year-specific data frames are combined into a single final data frame with do.call and rbind.
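The same split-apply-combine idea can be wrapped in a small helper so the variables and the combining function become arguments. This is a sketch, not part of the answer above: pair_aggregate and its arguments are hypothetical names, FUN must be vectorised over two equal-length vectors (such as `+` or pmax), and every year is assumed to have at least two rows, since combn() fails otherwise.

```r
pair_aggregate <- function(data, vars, FUN = `+`) {
  per_year <- lapply(split(data, data$year), function(x) {
    p <- combn(nrow(x), 2)        # each column is one pair of row indices
    first <- p[1, ]
    second <- p[2, ]
    out <- data.frame(
      name = paste(x$obs[first], x$obs[second], sep = "_"),
      obs1 = x$obs[first],
      obs2 = x$obs[second],
      year = x$year[1]
    )
    # Combine each requested variable pairwise; NA in either member gives NA
    for (v in vars) {
      out[[v]] <- FUN(x[[v]][first], x[[v]][second])
    }
    out
  })
  do.call(rbind, per_year)
}

# Subset of the sample data: observations 1, 3 and 6 in years 1 and 2
data <- data.frame(
  obs  = c(1, 1, 3, 3, 6, 6),
  year = c(1, 2, 1, 2, 1, 2),
  var1 = c(32, 33, 10, 10, 5, 5),
  var3 = c(NA, 2385, 1274, 1345, NA, 296)
)

sums   <- pair_aggregate(data, c("var1", "var3"))              # pairwise sums
maxima <- pair_aggregate(data, c("var1", "var3"), FUN = pmax)  # pairwise maxima

# Number of unordered pairs among 150 observations
n_pairs <- choose(150, 2)
```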

R: xlim, ylim and zlim not working for rgl.plot3d

I'm trying to create a 3d scatter plot using the following script:
d <- read.table(file='myfile.dat', header=F)
plot3d(
d,
xlim=c(0,20),
ylim=c(0,20),
zlim=c(0,10000),
xlab='Frequency',
ylab='Size',
zlab='Number of subgraphs',
box=F,
type='s',
size=0.5,
col=d[,1]
)
lines3d(
d,
xlim=c(2,20),
ylim=c(0,20),
zlim=c(0,10000),
lwd=2,
col=d[,1]
)
grid3d(side=c('x', 'y+', 'z'))
Now for some reason, R is ignoring the range limits I've specified and is using arbitrary values, messing up my plot. I get no error when I run the script. Does anybody have any idea what's wrong? If required, I can also post an image of the plot that is created. The data file is given below:
myfile.dat
11 2 2
NA NA NA
10 2 2
NA NA NA
13 2 1
NA NA NA
15 2 1
NA NA NA
5 2 11
5 3 10
5 4 16
5 5 34
5 6 102
5 7 294
5 8 682
5 9 1439
5 10 2646
5 11 3615
5 12 2844
5 13 1394
NA NA NA
4 2 10
4 3 4
4 4 4
4 5 10
4 6 38
4 7 132
4 8 396
4 9 976
4 10 2121
4 11 4085
4 12 6261
4 13 6459
4 14 4238
4 15 1394
NA NA NA
7 2 3
NA NA NA
6 2 2
NA NA NA
9 2 8
9 3 6
9 4 4
9 5 5
NA NA NA
8 2 4
8 3 10
8 4 22
8 5 52
8 6 126
8 7 264
8 8 478
8 9 729
8 10 943
8 11 754
8 12 382
NA NA NA
The help page, ?plot3d says "Note that since rgl does not currently support clipping, all points will be plotted, and 'xlim', 'ylim', and 'zlim' will only be used to increase the respective ranges." So you need to restrict the data in the input stage. (And you will need to use segments3d instead of lines3d if you only want particular ranges that are inside the plotted volume.)
d2 <- subset(d, d[,1]>0 & d[,1]<20 & d[,2]>0 & d[,2]<20 & d[,3]>0 & d[,3]<10000) # note: this also drops the NA rows
plot3d(
d2[, 1:3], # You can probably use something more meaningful,
xlim=c(0,20),
ylim=c(0,20),
zlim=c(0,10000),
xlab='Frequency',
ylab='Size',
zlab='Number of subgraphs',
box=F,
type='s',
size=0.5,
col=d2[,1] # colour by the filtered data so colours line up with the plotted rows
)
(I did notice that when the range was c(0,10000), the points were so small as to be pretty much invisible. Further experimentation suggests that the great disparity in ranges is going to cause further difficulties in keeping the ranges at 0 on the low side if you increase the size to the point where the points are visible. If you make the points really big, they expand the range to accommodate the overlap beyond the x=0 or y=0 planes.)
As DWin said, lines3d does not handle *lim arguments. From the help page, "... Material properties (see rgl.material), normals and texture coordinates (see rgl.primitive)."
So use some other function, or perhaps you could retrieve the existing limits from your plot3d call and use those to scale your data prior to plotting?
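One way to restrict the data at the input stage while keeping the NA separator rows is to overwrite out-of-range rows with NA instead of dropping them. The helper below is a sketch: mask_outside is a hypothetical name, the sample data frame is made up, and it assumes (as the layout of the data file suggests) that NA rows are meant to act as breaks between line strips. The rgl calls are guarded so they only run interactively with rgl installed.

```r
# Replace every row that falls outside the limits (or is already NA) with NA
mask_outside <- function(d, xlim, ylim, zlim) {
  inside <- d[[1]] >= xlim[1] & d[[1]] <= xlim[2] &
            d[[2]] >= ylim[1] & d[[2]] <= ylim[2] &
            d[[3]] >= zlim[1] & d[[3]] <= zlim[2]
  # %in% TRUE maps NA comparisons to FALSE, so NA rows stay NA
  d[!(inside %in% TRUE), ] <- NA
  d
}

# Hypothetical data: row 3 is a separator, rows 4 and 5 are out of range
d <- data.frame(V1 = c(5, 5, NA, 25, 4),
                V2 = c(2, 3, NA, 2, 30),
                V3 = c(11, 10, NA, 5, 10))

d2 <- mask_outside(d, xlim = c(0, 20), ylim = c(0, 20), zlim = c(0, 10000))

if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(d2, xlim = c(0, 20), ylim = c(0, 20), zlim = c(0, 10000),
              type = "s", size = 0.5)
  rgl::lines3d(d2, lwd = 2)
}
```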

Resources