R plot function gives out weird answer


I tried to use R's plot() to draw the curves of some functions. Sometimes I get very weird results. Here is an example:
u=c(2,2,2,2,2)
elas=function(P){
prob=P/sum(u,P)
return(prob)
}
plot(elas,0,6)
This code produces a plot like this:
It is obviously not right. The correct plot should look like this:
I know that if I change the 3rd line of the code to be
prob=P/(sum(u)+P)
It would work. But I do not understand why my original code does not work. Does it mean that I cannot plot a function which embeds another function?

sum(u,P) is a single value equal to the sum of all the values in u and P. So in elas, every value of P gets divided by the same number (313 in your example, because plot() evaluates the function at 101 points between 0 and 6).
sum(u) + P is a vector containing each individual value of P with sum(u) added to it. So in the second version of elas (which I've called elas2 below), P/(sum(u) + P) results in element-by-element division of P by sum(u) + P.
Consider the examples below.
u=c(2,2,2,2,2)
x=seq(0,6,length=101)
sum(u,x)
[1] 313
sum(u) + x
[1] 10.00 10.06 10.12 10.18 10.24 10.30 10.36 10.42 10.48 10.54 10.60 10.66 10.72 10.78
[15] 10.84 10.90 10.96 11.02 11.08 11.14 11.20 11.26 11.32 11.38 11.44 11.50 11.56 11.62
[29] 11.68 11.74 11.80 11.86 11.92 11.98 12.04 12.10 12.16 12.22 12.28 12.34 12.40 12.46
[43] 12.52 12.58 12.64 12.70 12.76 12.82 12.88 12.94 13.00 13.06 13.12 13.18 13.24 13.30
[57] 13.36 13.42 13.48 13.54 13.60 13.66 13.72 13.78 13.84 13.90 13.96 14.02 14.08 14.14
[71] 14.20 14.26 14.32 14.38 14.44 14.50 14.56 14.62 14.68 14.74 14.80 14.86 14.92 14.98
[85] 15.04 15.10 15.16 15.22 15.28 15.34 15.40 15.46 15.52 15.58 15.64 15.70 15.76 15.82
[99] 15.88 15.94 16.00
par(mfrow=c(1,3))
elas=function(P) {
P/sum(u,P)
}
dat = data.frame(x, y=elas(x), y_calc=x/sum(u,x))
plot(dat$x, dat$y, type="l", lwd=2, ylim=c(0,0.020))
plot(elas, 0, 6, lwd=2, ylim=c(0,0.020))
curve(elas, 0, 6, lwd=2, ylim=c(0,0.020))
dat$y - dat$y_calc
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[43] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[85] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
elas2 = function(P) {
P/(sum(u) + P)
}
dat$y2 = elas2(x)
plot(dat$x, dat$y2, type="l", lwd=2, ylim=c(0,0.4))
plot(elas2, 0, 6, lwd=2, ylim=c(0,0.4))
curve(elas2, 0, 6, lwd=2, ylim=c(0,0.4))
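To address the last part of the question directly: a function that calls another function can be plotted, as long as it returns one value per input value. The sketch below is a minimal illustration, not part of the original answer; the names denominator, elas_v and same are introduced here for illustration. It assumes base R only and uses Vectorize() so that the original function is called once per element of P.

```r
u <- c(2, 2, 2, 2, 2)

# The version from the question: fine for a single P, wrong for a vector of P
elas <- function(P) P / sum(u, P)

# plot(fn, from, to) calls fn once with a whole vector of x values
# (101 points by default), so sum(u, P) collapses them into one number
x <- seq(0, 6, length.out = 101)
denominator <- sum(u, x)   # 10 + sum(x), approximately 313

# Vectorize() runs the function once per element of P, so each call
# sees a scalar and sum(u, P) is then the same as sum(u) + P
elas_v <- Vectorize(elas)

# Compare against the explicitly vectorised form from the answer
same <- isTRUE(all.equal(elas_v(x), x / (sum(u) + x)))

plot(elas_v, 0, 6)
```

Vectorize() is convenient but slower than writing P/(sum(u) + P) directly, so the explicit form is usually preferable when it is available.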

sum(u+P) = sum of each value in u plus P
sum(u) + P = sum of values in u plus P.
Example: u = c(1,2,3), P = 5
sum(u+P) = (1+5) + (2+5) + (3+5) = 6+7+8 = 21
sum(u) + P = (1+2+3) + 5 = 6 + 5 = 11
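For the function from the question (the output below corresponds to sum(u,P), since sum(u, 0:6) = 10 + 21 = 31):
elas <- function(P){
prob=P/sum(u,P)
return(prob)
}
with u <- c(2,2,2,2,2):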
y <- elas(0:6)
print(y)
#output of print y:
0.00000000 0.03225806 0.06451613 0.09677419 0.12903226 0.16129032
0.19354839
plot(0:6,y)
For
elas <- function(P){
prob=P/(sum(u) + P)
return(prob)
}
y <- elas(0:6)
print(y)
#Output of print(y):
0.00000000 0.09090909 0.16666667 0.23076923 0.28571429 0.33333333 0.37500000
plot(0:6,y)
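The three forms discussed in this thread can be compared side by side. This is a small sketch using the numbers from the example above; the variable names a, b, c3, d1 and d2 are only for illustration.

```r
# Scalar P: sum(u, P) and sum(u) + P agree, sum(u + P) does not
u <- c(1, 2, 3)
P <- 5
a  <- sum(u + P)   # (1+5) + (2+5) + (3+5) = 21
b  <- sum(u, P)    # 1 + 2 + 3 + 5 = 11
c3 <- sum(u) + P   # 6 + 5 = 11

# Vector P: only sum(u) + P keeps one denominator per element of P
u <- c(2, 2, 2, 2, 2)
P <- 0:6
d1 <- sum(u, P)    # 10 + 21 = 31, a single number
d2 <- sum(u) + P   # 10 11 12 13 14 15 16
```

This is why the first function divides every element of P by 31, while the second divides each element by its own denominator.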

Related

get the name of child list in a list in R with lapply function

How can I get the name of child list in a list in R? My list is like:
$sd1
freq value order
11 1.15 17 0
12 2.12 13 0
13 2.81 21 0
14 4.13 15 0
15 4.84 18 0
16 7.54 59 0
17 9.36 17 0
$sd2
freq value order
31 0.63 4 0
32 1.54 3 0
33 3.22 3 0
34 3.98 4 0
35 4.66 38 0
36 7.14 3 0
37 9.39 29 0
$sd3
freq value order
41 0.97 4 0
42 2.03 7 0
43 2.65 4 0
44 3.34 680 0
45 4.15 4 0
46 6.67 10 0
47 7.51 6 0
48 8.35 4 0
49 10.57 4 0
50 15.97 6 0
I'd like to get sd1,sd2... with lapply function and make some changes on each child list of sd1, sd2, etc.
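One possible approach, sketched here under assumptions: lst is a hypothetical stand-in for the list in the question (shortened), and the added source column is only an example of a change made to each child. names() returns the child names, and Map() passes each child together with its name.

```r
# Hypothetical stand-in for the list shown in the question (shortened)
lst <- list(
  sd1 = data.frame(freq = c(1.15, 2.12), value = c(17, 13), order = 0),
  sd2 = data.frame(freq = c(0.63, 1.54), value = c(4, 3), order = 0)
)

# The names of the child elements
child_names <- names(lst)

# Map() walks the elements and their names in parallel and keeps the names
res <- Map(function(d, nm) {
  d$source <- nm   # example change: record which child the rows came from
  d
}, lst, names(lst))
```

The same thing can be written as lapply(names(lst), function(nm) ...) with lst[[nm]] inside the function, but the result then has to be renamed afterwards.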

R variable evaluates differently depending on context - in loop or not

EDIT: now with reproducible code/data.
I am trying to run chi-squared tests on multiple variables in my dataframe.
Using the npk dataset:
A single variable, N, produces the proper result:
npk %>%
group_by(yield, N) %>%
select(yield, N) %>%
table() %>%
print() %>%
chisq.test()
As you can see the output of table() is in a form that chisq.test() can utilize.
N
yield 0 1
44.2 1 0
45.5 1 0
46.8 1 0
48.8 1 1
49.5 1 0
49.8 0 1
51.5 1 0
52 0 1
53.2 1 0
55 1 0
55.5 1 0
55.8 0 1
56 2 0
57 0 1
57.2 0 1
58.5 0 1
59 0 1
59.8 0 1
62 0 1
62.8 1 1
69.5 0 1
Pearson's Chi-squared test
data: .
X-squared = 20, df = 20, p-value = 0.4579
When I try and do multiple tests using a loop something about calling on the particular variable changes the output of my table and the chi-squared test cannot run.
Create the list that the loop runs through:
test_ordinal_variables <- noquote(names(npk[2:4]))
test_ordinal_variables
The loop with the errorcode: (1:1 for clarity, error is repeated if you use 1:3)
for (i in 1:1){
npk %>%
group_by(yield, test_ordinal_variables[i]) %>%
select(yield, test_ordinal_variables[i]) %>%
table() %>%
print() %>%
chisq.test()
}
The output clearly showing the table that chisq.test() cannot interpret:
Adding missing grouping variables: `test_ordinal_variables[i]`
, , N = 0
yield
test_ordinal_variables[i] 44.2 45.5 46.8 48.8 49.5 49.8 51.5 52 53.2 55 55.5 55.8 56 57 57.2 58.5 59 59.8 62
N 1 1 1 1 1 0 1 0 1 1 1 0 2 0 0 0 0 0 0
yield
test_ordinal_variables[i] 62.8 69.5
N 1 0
, , N = 1
yield
test_ordinal_variables[i] 44.2 45.5 46.8 48.8 49.5 49.8 51.5 52 53.2 55 55.5 55.8 56 57 57.2 58.5 59 59.8 62
N 0 0 0 1 0 1 0 1 0 0 0 1 0 1 1 1 1 1 1
yield
test_ordinal_variables[i] 62.8 69.5
N 1 1
For some reason test_ordinal_variables[i] is not evaluating perfectly to what I would expect when it is in the loop. You can see as the error claimed that it is "Adding missing grouping variables", but if it just evaluated the expression rather than adding a variable then I think it would work.
This evaluates on its own as I would expect.
> test_ordinal_variables[1]
[1] N
So why won't it do the same when it is in the loop?

Since you are passing a dynamic, quoted variable name into a dplyr chain, consider the underscore counterparts group_by_() and select_(). And since yield is not being passed dynamically, convert it to a symbol with as.symbol() so that it is processed as a column name.
for (i in names(npk[2:4])){
npk %>%
group_by_(as.symbol("yield"), i) %>%
select_(as.symbol("yield"), i) %>%
table() %>%
print() %>%
chisq.test() %>%
print()
}
Output
N
yield 0 1
44.2 1 0
45.5 1 0
46.8 1 0
48.8 1 1
49.5 1 0
49.8 0 1
51.5 1 0
52 0 1
53.2 1 0
55 1 0
55.5 1 0
55.8 0 1
56 2 0
57 0 1
57.2 0 1
58.5 0 1
59 0 1
59.8 0 1
62 0 1
62.8 1 1
69.5 0 1
Pearson's Chi-squared test
data: .
X-squared = 20, df = 20, p-value = 0.4579
P
yield 0 1
44.2 0 1
45.5 1 0
46.8 1 0
48.8 0 2
49.5 0 1
49.8 1 0
51.5 1 0
52 0 1
53.2 0 1
55 1 0
55.5 1 0
55.8 0 1
56 1 1
57 1 0
57.2 1 0
58.5 0 1
59 0 1
59.8 1 0
62 1 0
62.8 0 2
69.5 1 0
Pearson's Chi-squared test
data: .
X-squared = 22, df = 20, p-value = 0.3405
K
yield 0 1
44.2 1 0
45.5 0 1
46.8 1 0
48.8 0 2
49.5 0 1
49.8 0 1
51.5 1 0
52 1 0
53.2 0 1
55 0 1
55.5 0 1
55.8 0 1
56 2 0
57 0 1
57.2 0 1
58.5 0 1
59 1 0
59.8 1 0
62 1 0
62.8 2 0
69.5 1 0
Pearson's Chi-squared test
data: .
X-squared = 24, df = 20, p-value = 0.2424
Warning messages:
1: In chisq.test(.) : Chi-squared approximation may be incorrect
2: In chisq.test(.) : Chi-squared approximation may be incorrect
3: In chisq.test(.) : Chi-squared approximation may be incorrect
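The underscore verbs such as group_by_() and select_() have since been deprecated in dplyr, so the loop above may warn or fail on current versions. As an alternative sketch that avoids dplyr entirely, the same tables can be built in base R by looking the column up by name with [[ ]]. The names vars, tab and results are introduced here for illustration.

```r
# npk ships with base R (datasets package); columns 2:4 are N, P and K
vars <- names(npk)[2:4]

results <- list()
for (v in vars) {
  # npk[[v]] looks the column up by its name held in v
  tab <- table(yield = npk$yield, npk[[v]])
  names(dimnames(tab))[2] <- v   # label the second dimension with the variable name
  print(tab)
  # The expected counts are small, so chisq.test() warns that the
  # approximation may be incorrect; the warning is suppressed here
  results[[v]] <- suppressWarnings(chisq.test(tab))
  print(results[[v]])
}
```

Storing the tests in a named list makes it easy to pull out individual values afterwards, for example results$N$p.value.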

How can I get the recapture probabilities in R (which package to use) ?

I'm trying to find a way to estimate the recapture probabilities in my data. Here is an example directly from the package FSA in R.
library(FSA)
## First example -- capture histories summarized with capHistSum()
data(CutthroatAL)
ch1 <- capHistSum(CutthroatAL,cols2use=-1) # ignore first column of fish ID
ex1 <- mrOpen(ch1)
summary(ex1)
summary(ex1,verbose=TRUE)
confint(ex1)
confint(ex1,verbose=TRUE)
If you type summary(ex1,verbose=TRUE), you'll have this result
# Observables:
# m n R r z
# i=1 0 89 89 26 NA
# i=2 22 352 352 96 4
# i=3 94 292 292 51 6
# i=4 41 233 233 46 16
# i=5 58 259 259 100 4
# i=6 99 370 370 99 5
# i=7 91 290 290 44 13
# i=8 52 134 134 13 5
# i=9 18 140 0 NA NA
# Estimates (phi.se includes sampling and individual variability):
# M M.se N N.se phi phi.se B B.se
# i=1 NA NA NA NA 0.411 0.088 NA NA
# i=2 36.6 6.4 561.1 117.9 0.349 0.045 198.6 48.2
# i=3 127.8 13.4 394.2 44.2 0.370 0.071 526.3 119.7
# i=4 120.7 20.8 672.2 138.8 0.218 0.031 154.1 30.2
# i=5 68.3 4.1 301.0 21.8 0.437 0.041 304.7 25.4
# i=6 117.5 7.3 436.1 30.3 0.451 0.069 357.2 61.2
# i=7 175.1 24.6 553.7 84.3 0.268 0.072 106.9 36.2
# i=8 100.2 24.7 255.3 65.4 NA NA NA NA
# i=9 NA NA NA NA NA NA NA NA
Since "Observables" is not returned as a list, I cannot extract the numbers automatically. Is that possible?
I have the same type of dataset, but the output won't show me a probability of recapture. I have an open population, which is why I am trying to use this package.
Here's a look at a typical dataset:
head(CutthroatAL)
# id y1998 y1999 y2000 y2001 y2002 y2003 y2004 y2005 y2006
# 1 1 0 0 0 0 0 0 0 0 1
# 2 2 0 0 0 0 0 0 0 0 1
# 3 3 0 0 0 0 0 0 0 0 1
# 4 4 0 0 0 0 0 0 0 0 1
# 5 5 0 0 0 0 0 0 0 0 1
# 6 6 0 0 0 0 0 0 0 0 1
I also tried the package mra and its F.cjs.estim() function, but I don't have survival information.
I haven't found any function in Rcapture that lets me print a capture probability.
I'm trying to find the quantity pj on page 38 of the book Handbook of Capture-Recapture Analysis.
I haven't found it in the RMark package either.
So how can I estimate recapture probabilities in R?
Thanks,

If you just want to capture the "Observable" values in the summary, you can do it the same way the function does. If you look at the source for FSA:::summary.mrOpen, you can see that you can grab those values with
ex1$df[, c("m", "n", "R", "r", "z")]

Organizing three dimensional data from table into matrix/array form using R

I have a table that looks similar to this
MUNI YEAR ENTE SALE
D101 1995 F001 1000
D101 1995 F002 1200
D101 1995 F003 1300
D101 1996 F001 1000
D101 1996 F003 1250
D101 1996 F004 1300
D101 1997 F001 1000
D101 1998 F002 1400
D101 1998 F003 1500
D102 1995 F001 1000
D102 1995 F003 1200
D102 1995 F006 1300
D102 1996 F001 1050
D102 1996 F002 1320
D102 1996 F003 1250
D102 1996 F006 1350
D102 1996 F002 1320
...
It is a sales table where MUNI stands for markets and ENTE stands for firms. The data consists of 7 years, 1200 markets and 200 firms. I would like to reorganize this table into a matrix form such that the dimensions are (rows = MUNI X YEAR, Cols = ENTE) and in each cell there is the value of sale, something like this
MUNIxYEAR\ENTE F001 F002 F003 F004 ...
D101x1995 1000 1200 1300 NA ...
D101x1996 1000 NA 1250 1300 ...
...
I am not sure how to do this, or what the best way to proceed is, to get the data organization described above. I have checked other posts and I believe the way to do this is to use the command sparseMatrix. However, I don't know how to use it when (1) there are multiple criteria (i.e., two conditions for the rows) and (2) the dimensions of the matrix are string IDs (change them into factors and then get the levels?).
Thanks in advance for any help and guidance.

There are many ways and packages to do that. Here is a method using the tidyr package:
library(tidyr)
df = data.frame(MUNI = rep(paste0("D10", c(1,1,2,2,3,4)), each = 2),
YEAR = rep(1999:2000,3),
ENTE = paste0("F00", c(1,2,3,3,4,5)),
SALE = sample(1000:2000, 6, replace = T))
df
# MUNI YEAR ENTE SALE
# 1 D101 1999 F001 1670
# 2 D101 2000 F002 1420
# 3 D101 1999 F003 1985
# 4 D101 2000 F003 1914
# 5 D102 1999 F004 1727
# 6 D102 2000 F005 1195
# 7 D102 1999 F001 1670
# 8 D102 2000 F002 1420
# 9 D103 1999 F003 1985
# 10 D103 2000 F003 1914
# 11 D104 1999 F004 1727
# 12 D104 2000 F005 1195
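# Note: the SALE values printed below differ from the printout above, presumably because sample() was rerun without a seed
spread(df,ENTE,SALE, fill=0) # in case you decide to have each column separately for querying or further grouping in the future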
# MUNI YEAR F001 F002 F003 F004 F005
# 1 D101 1999 1716 0 1516 0 0
# 2 D101 2000 0 1917 1155 0 0
# 3 D102 1999 1716 0 0 1259 0
# 4 D102 2000 0 1917 0 0 1291
# 5 D103 1999 0 0 1516 0 0
# 6 D103 2000 0 0 1155 0 0
# 7 D104 1999 0 0 0 1259 0
# 8 D104 2000 0 0 0 0 1291
df2 = spread(df,ENTE,SALE, fill=0)
unite(df2, "MUNIxYEAR", MUNI,YEAR, sep = " x ") # if you want to combine columns
# MUNIxYEAR F001 F002 F003 F004 F005
# 1 D101 x 1999 1716 0 1516 0 0
# 2 D101 x 2000 0 1917 1155 0 0
# 3 D102 x 1999 1716 0 0 1259 0
# 4 D102 x 2000 0 1917 0 0 1291
# 5 D103 x 1999 0 0 1516 0 0
# 6 D103 x 2000 0 0 1155 0 0
# 7 D104 x 1999 0 0 0 1259 0
# 8 D104 x 2000 0 0 0 0 1291

You can use xtabs
For instance:
# Set random seed for reproducibility
set.seed(12345)
# Generate 500 rows of random data
my.data = data.frame(MUNI = rep(paste0("D", 101:110), each = 50),
YEAR = sample(1990:2000, 500, replace = TRUE),
ENTE = sample(paste0("F00", 1:9), 500, replace = T),
SALE = sample(1000:2000, 500, replace = T)
)
# Create a new column with the string "MUNIxYEAR"
my.data$MUNIxYEAR = paste(my.data$MUNI, my.data$YEAR, sep = "x")
# Call xtabs to get the table!
res <- xtabs(SALE ~ MUNIxYEAR + ENTE, my.data)
First lines of the output:
ENTE
MUNIxYEAR F001 F002 F003 F004 F005 F006 F007 F008 F009
D101x1990 1339 0 0 1693 0 2831 2779 0 0
D101x1991 0 1407 0 3619 0 0 0 1254 0
D101x1992 0 0 0 0 1807 0 1766 0 1657
D101x1993 1174 1154 0 0 1794 0 0 1218 0
D101x1994 0 1015 6636 0 0 0 2126 0 0
D101x1995 0 0 0 0 0 3478 3228 1517 0
D101x1996 0 0 1304 0 0 0 1505 0 0
D101x1997 0 1077 1481 1802 0 2494 0 0 0
D101x1998 0 0 1660 5366 1844 0 0 1006 0
D101x1999 0 1437 0 0 0 0 1844 0 2394
D101x2000 0 0 1714 0 0 0 1950 1758 1108
D102x1990 3761 0 3307 1182 0 0 0 0 0
D102x1991 0 0 0 1539 2716 0 1716 0 0
D102x1992 1980 0 1056 1458 0 0 0 0 1641
D102x1993 0 0 1429 0 1784 0 1114 0 0
D102x1994 0 0 0 0 1377 0 1038 1000 0
D102x1995 0 0 1088 0 0 1031 4205 1764 0
D102x1996 0 0 0 0 1658 0 3559 0 0
D102x1997 0 1048 2453 0 0 1741 0 0 0
D102x1998 1427 5139 0 1336 0 0 1372 0 1395
D102x1999 0 0 0 3957 0 1972 0 0 0
D102x2000 0 3258 0 0 0 3780 0 3299 1360
D103x1990 0 0 0 1247 1526 0 0 0 1234
D103x1991 0 1919 0 0 0 0 0 1704 0
D103x1992 0 1489 0 0 4428 0 1371 0 0
D103x1993 0 1477 0 0 0 0 1319 0 1211
D103x1994 0 2649 0 0 1488 0 0 0 0

The xtabs function can help reformat your data into a 3 dimensional array and then the ftable function can flatten it to the 2 dimensional table.
Other options would be the reshape2 or plyr packages (and probably others as well).
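The xtabs plus ftable route can be sketched as follows. This is only an illustration in base R; the data frame d is a small hypothetical sample built from the first rows of the table in the question. Because xtabs fills empty cells with 0, a tapply variant is included for the case where NA is wanted for missing market/firm combinations, as in the layout the question asks for.

```r
# Small sample taken from the first rows of the table in the question
d <- data.frame(
  MUNI = c("D101", "D101", "D101", "D101", "D101", "D101", "D102"),
  YEAR = c(1995, 1995, 1995, 1996, 1996, 1996, 1995),
  ENTE = c("F001", "F002", "F003", "F001", "F003", "F004", "F001"),
  SALE = c(1000, 1200, 1300, 1000, 1250, 1300, 1000)
)

# Three-dimensional array: MUNI x YEAR x ENTE (empty cells become 0)
xt <- xtabs(SALE ~ MUNI + YEAR + ENTE, data = d)

# Flatten to two dimensions: rows are MUNI/YEAR combinations, columns are ENTE
ft <- ftable(xt, row.vars = c("MUNI", "YEAR"))

# Alternative: a plain matrix with NA (not 0) where no sale was recorded
m <- tapply(d$SALE, list(paste(d$MUNI, d$YEAR, sep = "x"), d$ENTE), sum)
```

Note that both variants add up duplicate MUNI/YEAR/ENTE rows, such as the repeated D102 1996 F002 line in the question, so duplicates should be removed first if that is not intended.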

Reformat data tables based on row names to generate new columns in R

I have a data frame that looks like
m1 m2 m3
P001.st 60.00 2.0 1
P003.nd 14.30 2.077 1
P003.rt 29.60 2.077 1
P006.st 10.30 2.077 1
P006.nd 79.30 2.077 1
P008.nd 9.16 2.077 1
I want to reformat the table so that only the first part of each row name (before the period, i.e., P001, P003 etc.) appears as the row name, and the values of all rows sharing that prefix are appended as new columns. The output should look like
m1st m2st m3st m1nd m2nd m3nd m1rt m2rt m3rt
P001 60.00 2.0 1 0 0 0 0 0 0
P003 0 0 0 14.30 2.077 1 29.60 2.077 1
P006 10.30 2.077 1 79.30 2.077 1 0 0 0
P008 0 0 0 9.16 2.077 1 0 0 0
The aggregate function like
aggregate(value~name, df, I)
or a method from data.table like
setDT(df)[, list(value=list(value)), by=name]
would not work because the row names are not exactly the same. Any suggestions for matching hundreds of rows with many variable subtypes (i.e., the part after the period: .nd, .st etc.)?

dt = as.data.table(your_df, keep.rownames = T)
# split the row names into two id's
dt[, `:=`(id1 = sub('\\..*', '', rn), id2 = sub('.*\\.', '', rn), rn = NULL)]
# melt and dcast (need latest 1.9.5 or have to load reshape2 and use dcast.data.table)
dcast(melt(dt, id.vars = c('id1', 'id2')), id1 ~ variable + id2, fill = 0)
# id1 m1_nd m1_rt m1_st m2_nd m2_rt m2_st m3_nd m3_rt m3_st
#1: P001 0.00 0.0 60.0 0.000 0.000 2.000 0 0 1
#2: P003 14.30 29.6 0.0 2.077 2.077 0.000 1 1 0
#3: P006 79.30 0.0 10.3 2.077 0.000 2.077 1 0 1
#4: P008 9.16 0.0 0.0 2.077 0.000 0.000 1 0 0

Here's another way to do it:
library(dplyr)
library(tidyr)
(wide <- reshape(df %>% add_rownames() %>% separate(rowname, c("rowname", "id")),
idvar = "rowname",
timevar = "id",
direction = "wide",
sep = ""))
# rowname m1st m2st m3st m1nd m2nd m3nd m1rt m2rt m3rt
# 1 P001 60.0 2.000 1 NA NA NA NA NA NA
# 2 P003 NA NA NA 14.30 2.077 1 29.6 2.077 1
# 4 P006 10.3 2.077 1 79.30 2.077 1 NA NA NA
# 6 P008 NA NA NA 9.16 2.077 1 NA NA NA
wide[is.na(wide)] <- 0
rownames(wide) <- wide[, 1]
wide$rowname <- NULL
wide
# m1st m2st m3st m1nd m2nd m3nd m1rt m2rt m3rt
# P001 60.0 2.000 1 0.00 0.000 0 0.0 0.000 0
# P003 0.0 0.000 0 14.30 2.077 1 29.6 2.077 1
# P006 10.3 2.077 1 79.30 2.077 1 0.0 0.000 0
# P008 0.0 0.000 0 9.16 2.077 1 0.0 0.000 0
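The same reshape() call also works without dplyr and tidyr, which may be useful because add_rownames() has since been deprecated in dplyr. The sketch below is a base R variant under that assumption; the column names id and suffix are introduced here for illustration, and the data frame is rebuilt from the values in the question.

```r
# Data from the question
df <- data.frame(
  m1 = c(60, 14.3, 29.6, 10.3, 79.3, 9.16),
  m2 = c(2, 2.077, 2.077, 2.077, 2.077, 2.077),
  m3 = c(1, 1, 1, 1, 1, 1),
  row.names = c("P001.st", "P003.nd", "P003.rt",
                "P006.st", "P006.nd", "P008.nd")
)

# Split each row name at the period into prefix (id) and suffix
df$id     <- sub("\\..*$", "", rownames(df))
df$suffix <- sub("^.*\\.", "", rownames(df))

# One row per id, one block of m1/m2/m3 columns per suffix
wide <- reshape(df, idvar = "id", timevar = "suffix",
                direction = "wide", sep = "")

# Replace missing combinations with 0 and move the id into the row names
wide[is.na(wide)] <- 0
rownames(wide) <- wide$id
wide$id <- NULL
wide
```

The column blocks appear in the order in which the suffixes first occur in the data (st, nd, rt), which matches the layout requested in the question.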

If your data frame is called "data":
library(reshape2)
data$prefix <- gsub("(.*)\\..*","\\1",row.names(data))
data$suffix <- gsub(".*\\.(.*)","\\1",row.names(data))
data.melt <- melt(data)
data.melt
data.cast <- dcast(data.melt,prefix~variable+suffix,mean)
# set the row names to prefix
row.names(data.cast) <- data.cast$prefix
# get rid of the prefix column
data.cast <- data.cast[,-1]
data.cast
Gives
Using prefix, suffix as id variables
m1_nd m1_rt m1_st m2_nd m2_rt m2_st m3_nd m3_rt m3_st
P001 NaN NaN 60.0 NaN NaN 2.000 NaN NaN 1
P003 14.30 29.6 NaN 2.077 2.077 NaN 1 1 NaN
P006 79.30 NaN 10.3 2.077 NaN 2.077 1 NaN 1
P008 9.16 NaN NaN 2.077 NaN NaN 1 NaN NaN
To correct the column names and zeros instead of NaN, do
names(data.cast) <- gsub("_","",names(data.cast))
apply(data.cast,c(1,2),function(x){as.numeric(ifelse(is.na(x),0,x)) })
To get
m1nd m1rt m1st m2nd m2rt m2st m3nd m3rt m3st
P001 0.00 0.0 60.0 0.000 0.000 2.000 0 0 1
P003 14.30 29.6 0.0 2.077 2.077 0.000 1 1 0
P006 79.30 0.0 10.3 2.077 0.000 2.077 1 0 1
P008 9.16 0.0 0.0 2.077 0.000 0.000 1 0 0

Try this:
library(tidyr)
library(dplyr)
library(reshape2)
library(stringr)
data <-
structure(list(m1 = c(60, 14.3, 29.6, 10.3, 79.3, 9.16),
m2 = c(2, 2.077, 2.077, 2.077, 2.077, 2.077),
m3 = c(1L, 1L, 1L, 1L, 1L, 1L)),
.Names = c("m1", "m2", "m3"),
class = "data.frame",
row.names = c("P001.st", "P003.nd", "P003.rt",
"P006.st", "P006.nd", "P008.nd"))
my_data <-
as_data_frame(cbind(col_01 = rownames(data), data)) %>%
melt(.) %>%
separate(., col_01, into = c("var_01", "var_02"), sep = "\\.") %>%
mutate(my_var = str_c(variable, var_02)) %>%
select(var_01, my_var, value) %>%
arrange(var_01, my_var) %>%
spread(., my_var, value)
my_data
var_01 m1nd m1rt m1st m2nd m2rt m2st m3nd m3rt m3st
1 P001 NA NA 60.0 NA NA 2.000 NA NA 1
2 P003 14.30 29.6 NA 2.077 2.077 NA 1 1 NA
3 P006 79.30 NA 10.3 2.077 NA 2.077 1 NA 1
4 P008 9.16 NA NA 2.077 NA NA 1 NA NA
If you want to replace NAs with 0, you can do it like this:
my_data[is.na(my_data)] <- 0
var_01 m1nd m1rt m1st m2nd m2rt m2st m3nd m3rt m3st
1 P001 0.00 0.0 60.0 0.000 0.000 2.000 0 0 1
2 P003 14.30 29.6 0.0 2.077 2.077 0.000 1 1 0
3 P006 79.30 0.0 10.3 2.077 0.000 2.077 1 0 1
4 P008 9.16 0.0 0.0 2.077 0.000 0.000 1 0 0

Using extract() instead of separate() with the more flexible regular expressions, using tidyr and dplyr:
df %>%
extract(id, c("id2", "var"), c("(P00.)\\.(..)")) %>%
gather(variable,value,c(m1,m2,m3)) %>%
mutate(var=paste0(variable,".",var)) %>%
select(-variable) %>%
spread(var,value,fill=0)
id2 m1.nd m1.rt m1.st m2.nd m2.rt m2.st m3.nd m3.rt m3.st
1 P001 0.00 0.0 60.0 0.000 0.000 2.000 0 0 1
2 P003 14.30 29.6 0.0 2.077 2.077 0.000 1 1 0
3 P006 79.30 0.0 10.3 2.077 0.000 2.077 1 0 1
4 P008 9.16 0.0 0.0 2.077 0.000 0.000 1 0 0
