plotting a two column data frame with date - r

I'm having trouble plotting this little data.frame. I use the plot() function, but it just gives me back a plot whose x axis is not the date in the first column.
> DDDhabd
Mes DDD.1000hab.día
1 Ene-14 0.03564701
2 Feb-14 0.03959695
3 Mar-14 0.04677090
4 Abr-14 0.04928782
5 May-14 0.03783808
6 Jun-14 0.04939231
7 Jul-14 0.05464189
8 Ago-14 0.05208003
9 Set-14 0.05475650
10 Oct-14 0.05290589
11 Nov-14 0.05714252
12 Dic-14 0.05056313
13 Ene-15 0.05688352
14 Feb-15 0.05710022
15 Mar-15 0.05754084
16 Abr-15 0.04362755
17 May-15 0.06209153
18 Jun-15 0.05715994
19 Jul-15 0.04373711
20 Ago-15 0.02462424
21 Set-15 0.03812404
22 Oct-15 0.08368198
23 Nov-15 0.07506378
24 Dic-15 0.05974877
I would really appreciate a hint about where my mistake is.
Thanks
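A likely cause is that the Mes column holds text such as "Ene-14" rather than Date values, so plot() cannot lay it out as a time axis. The following is only a minimal sketch of one way around that, assuming the two-digit years mean 20xx and using a hand-written lookup table for the Spanish month abbreviations so that it does not depend on the system locale. The names meses, partes, Fecha and the shortened column name DDD are made up for the illustration, and only the first four rows of the data are reproduced.

```r
# First rows of the data from the question (value column renamed to DDD for brevity)
DDDhabd <- data.frame(
  Mes = c("Ene-14", "Feb-14", "Mar-14", "Abr-14"),
  DDD = c(0.03564701, 0.03959695, 0.04677090, 0.04928782),
  stringsAsFactors = FALSE
)

# Lookup table: Spanish month abbreviation -> month number (assumption: these 12 spellings)
meses <- c(Ene = 1L, Feb = 2L, Mar = 3L, Abr = 4L, May = 5L, Jun = 6L,
           Jul = 7L, Ago = 8L, Set = 9L, Oct = 10L, Nov = 11L, Dic = 12L)

# Split "Ene-14" into month abbreviation and two-digit year
partes <- strsplit(DDDhabd$Mes, "-", fixed = TRUE)
mes    <- unname(meses[sapply(partes, `[`, 1)])
anio   <- 2000L + as.integer(sapply(partes, `[`, 2))  # assumes years are 20xx

# Build a real Date column (first day of each month) and plot against it
DDDhabd$Fecha <- as.Date(sprintf("%d-%02d-01", anio, mes))
plot(DDDhabd$Fecha, DDDhabd$DDD, type = "b",
     xlab = "Month", ylab = "DDD per 1000 inhabitants per day")
```

With a Date column on the x axis, plot() draws a proper time axis instead of treating the labels as categories.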

Related

Barplot in R fill with certain values

I have a dataset which contains the data of pairs playing a game. I have a barplot that shows the total games played by the pairs. But now I want those bars('number') to be filled with the amount of games they successfully completed('sum'). I can't get it to work. The barplot is created like this:
barplot(height = game_count$number, xlab = 'Pairs', ylim = c(0,35), ylab='Games played')
The data looks like this:
participants sum number
1 06104873220647518670 30 32
2 06105747340637377404 23 24
3 06113978630633565020 28 32
4 06121794480617858550 25 27
5 06122613960611857952 23 26
6 06123139380653583516 25 28
7 06123650620648276595 28 32
8 06124453210624910109 32 34
9 06127993700610846968 24 26
10 06128440030639764541 19 24
11 06132461300624244572 26 30
12 06137611390651588167 25 28
13 06145014400637290807 16 19
14 06163181050611257617 30 30
15 06172024240651919112 21 23
One option can be ggplot2:
library(ggplot2)
#Code
game_count$Freq <- game_count$sum/game_count$number
#Plot
ggplot(game_count,aes(x=1:nrow(game_count),y=Freq))+
geom_col(fill='cyan3',color='black')+
xlab('')
This worked for me:
barplot(t(game_count[c('number', 'sum')]), beside=TRUE, ylim=c(0,35), col=c('black', 'green'), main='Games played and successful games by the pairs', xlab='Pairs', ylab='Games')
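If the goal is for each bar to look "filled" up to the number of completed games, another possibility in base R is to draw the totals first and then overlay the completed counts on the same axes with add = TRUE. This is only a sketch on the first three rows of the data, not the approach used in the answers above.

```r
# First three pairs from the question's data
game_count <- data.frame(sum = c(30, 23, 28), number = c(32, 24, 32))

# Outer bars: total games played by each pair
barplot(game_count$number, col = "white", ylim = c(0, 35),
        xlab = "Pairs", ylab = "Games played")

# Inner bars: successfully completed games, drawn on top of the same bars
barplot(game_count$sum, col = "green", add = TRUE, axes = FALSE)
```

Because both calls use the same default bar width and spacing, the green bars line up with the white ones and leave only the unfinished games uncoloured.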

rpart -- number of splits

Using printcp I got output resembling the following (this is only a portion):
CP nsplit rel error xerror xstd
1 3.254666e-01 0 1.0000000 1.0000000 0.003976889
2 5.395058e-02 1 0.6745334 0.6745334 0.003567289
3 4.125633e-02 3 0.5666322 0.5878145 0.003401065
4 1.726150e-02 4 0.5253759 0.5492028 0.003317552
5 1.222830e-02 7 0.4735914 0.4925069 0.003183022
6 1.193864e-02 10 0.4364909 0.4744730 0.003137010
7 9.243634e-03 12 0.4126137 0.4489081 0.003068901
8 5.238899e-03 13 0.4033700 0.4277007 0.003009687
9 3.878800e-03 14 0.3981311 0.4183311 0.002982702
10 3.664710e-03 16 0.3903735 0.4115054 0.002962714
11 3.261718e-03 18 0.3830441 0.4098935 0.002957953
12 2.934287e-03 20 0.3765207 0.4063421 0.002947406
13 2.871320e-03 24 0.3647835 0.4044783 0.002941839
14 2.770571e-03 25 0.3619122 0.4000201 0.002928437
15 2.052742e-03 26 0.3591416 0.3973503 0.002920351
16 1.989774e-03 28 0.3550361 0.3924892 0.002905511
17 1.813465e-03 29 0.3530464 0.3911795 0.002901486
18 1.763091e-03 30 0.3512329 0.3880563 0.002891845
19 1.737904e-03 31 0.3494698 0.3863688 0.002886609
20 1.674936e-03 32 0.3477319 0.3832708 0.002876947
21 1.670739e-03 35 0.3422915 0.3830693 0.002876317
22 1.662343e-03 39 0.3355666 0.3827167 0.002875212
23 1.653947e-03 40 0.3339042 0.3824900 0.002874502
Which value shows the total number of splits in the tree -- nsplit, or the largest index (left-most column)? (I.e., 23 or 40?)
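The table you are seeing from the printcp function is the $cptable object of your CART model. The "nsplit" column does indeed show the number of splits: its value in the last row (40 here) is the total for the full tree, while the left-most column is just the row index.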
So, you can get the total number of splits in the tree with
max(carttree$cptable[,"nsplit"])
Where carttree is the name of your CART tree.
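As a quick check of this reading, here is a small sketch using the kyphosis data that ships with rpart (the model and the names fit, total_splits and n_internal are only illustrative). In a binary tree every split creates exactly one internal node, so the largest nsplit should match the number of non-leaf rows in the fitted tree's frame.

```r
library(rpart)

# Illustrative tree on the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Total number of splits according to the CP table
total_splits <- max(fit$cptable[, "nsplit"])

# Cross-check: count the internal (non-leaf) nodes of the fitted tree
n_internal <- sum(fit$frame$var != "<leaf>")

print(c(nsplit = total_splits, internal_nodes = n_internal))
```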

Morans correlogram with only one point. What is wrong?

I'm trying to compute Moran's I and plot the corresponding correlogram in R, but the plot has only one point. I have no idea what is going wrong. The code is based on
http://rstudio-pubs-static.s3.amazonaws.com/9688_a49c681fab974bbca889e3eae9fbb837.html
My data, called "coordenata":
resid x y
1 0.07785411 -53.20342 -22.66700
2 -0.28358702 -53.20389 -22.66864
3 -0.64011338 -53.21392 -22.68122
4 1.22071249 -53.21311 -22.72369
5 0.95734778 -53.28469 -22.75289
6 0.35345302 -53.25822 -22.74850
7 -0.68357738 -53.28344 -22.70694
8 -1.24596010 -53.32950 -22.72872
9 -0.19944162 -53.33669 -22.73561
10 0.67544909 -53.36756 -22.80767
11 0.64002961 -53.35947 -22.79958
12 0.04564233 -53.21889 -22.67419
13 0.01618436 -53.24522 -22.70144
14 -2.65436794 -53.23017 -22.69292
15 0.72096256 -53.25539 -22.69978
16 0.89656515 -53.28489 -22.72222
17 1.85358579 -53.33069 -22.79161
18 -0.03590077 -53.33200 -22.78336
19 0.32348975 -53.33494 -22.78586
20 2.06771402 -53.37781 -22.77869
21 -1.02190709 -53.30492 -22.77244
22 -2.02813250 -53.53917 -22.79856
23 -1.20702445 -53.53858 -22.79406
24 -1.24091732 -53.55272 -22.80536
25 -1.13491596 -53.56181 -22.82914
26 -0.82934613 -53.56422 -22.83417
27 1.23418758 -53.60017 -22.85531
28 -1.72808514 -53.65900 -22.97828
29 -0.02144049 -53.65908 -22.97497
30 0.49174568 -53.64597 -22.95439
31 -0.54408149 -53.64217 -22.91033
32 -0.37111342 -53.61447 -22.86269
33 -0.31121931 -53.27153 -22.70036
34 0.32419211 -53.30308 -22.72183
35 1.57980287 -53.33053 -22.72947
36 -1.91156060 -53.34633 -22.74722
37 -0.79036645 -53.23667 -22.68925
The code:
coordinates(coordenata)<-c("x","y")
fit2<-correlog(coordenata$x,coordenata$y,coordenata$resid,increment=5,resamp=100,quiet=T)
plot(fit2)
Thanks in advance for any help!

How do I get rid of commas and periods, etc in R? [duplicate]

This is my data set:
Depth.Fe
1 0,14.21
2 3,19.35
3 10,17.22
4 14,15.87
5 23,13.62
6 30,16.31
7 36,14.13
8 48,13.95
9 59,15
10 66,14.23
11 68,16.81
12 81,15.93
13 94,16.02
14 96,17.85
15 102,17.02
16 115,15.87
17 121,19.84
18 130,16.94
19 163,16.72
20 168,19.2
21 205,20.41
22 239,16.88
23 251,18.74
24 283,16.67
25 297,18.56
26 322,18.87
27 335,20.81
28 351,24.52
29 370,25.03
30 408,25.11
31 416,23.28
32 419,22.56
33 425,19
34 429,20.53
35 443,19.08
36 447,22.83
37 465,21.06
38 474,24.96
39 493,19.12
40 502,22.24
41 522,26.88
42 550,21.15
43 558,28.92
44 571,27.96
45 586,25.03
46 596,26.27
I want Depth and Fe to be separated into individual columns, but nothing I try is working.
Please help.
First of all, @akrun is definitely right in his comment on your post. If this dataset was imported from somewhere, follow his comment.
Assuming that somehow you were handed this weird dataset, I would try this:
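# ncol = 2 (not nrow = 2) so this works for any number of rows; as.character() guards against a factor column
df <- data.frame(matrix(as.numeric(unlist(strsplit(as.character(df$Depth.Fe), split = ","))), ncol = 2, byrow = TRUE))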
colnames(df) <- c("Depth","Fe")
This would take a dataset that looks like this:
Depth.Fe
1 0,14.21
2 3,19.35
to this:
Depth Fe
1 0 14.21
2 3 19.35
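Another way to get the same result, sketched here on three rows of the data, is to let read.table parse the combined strings directly through its text argument. This assumes every entry contains exactly one comma.

```r
# Three rows of the combined column from the question
df <- data.frame(Depth.Fe = c("0,14.21", "3,19.35", "59,15"),
                 stringsAsFactors = FALSE)

# Re-read the strings as comma-separated records with two named columns
out <- read.table(text = as.character(df$Depth.Fe), sep = ",",
                  col.names = c("Depth", "Fe"))
str(out)
```

read.table also takes care of the type conversion, so Depth comes back as integer and Fe as numeric without an explicit as.numeric() call.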

R sorts a vector on its own accord

df.sorted <- c("binned_walker1_1.grd", "binned_walker1_2.grd", "binned_walker1_3.grd",
"binned_walker1_4.grd", "binned_walker1_5.grd", "binned_walker1_6.grd",
"binned_walker2_1.grd", "binned_walker2_2.grd", "binned_walker3_1.grd",
"binned_walker3_2.grd", "binned_walker3_3.grd", "binned_walker3_4.grd",
"binned_walker3_5.grd", "binned_walker4_1.grd", "binned_walker4_2.grd",
"binned_walker4_3.grd", "binned_walker4_4.grd", "binned_walker4_5.grd",
"binned_walker5_1.grd", "binned_walker5_2.grd", "binned_walker5_3.grd",
"binned_walker5_4.grd", "binned_walker5_5.grd", "binned_walker5_6.grd",
"binned_walker6_1.grd", "binned_walker7_1.grd", "binned_walker7_2.grd",
"binned_walker7_3.grd", "binned_walker7_4.grd", "binned_walker7_5.grd",
"binned_walker8_1.grd", "binned_walker8_2.grd", "binned_walker9_1.grd",
"binned_walker9_2.grd", "binned_walker9_3.grd", "binned_walker9_4.grd",
"binned_walker10_1.grd", "binned_walker10_2.grd", "binned_walker10_3.grd")
One would expect that order of this vector would be 1:length(df.sorted), but that appears not to be the case. It looks like R internally sorts the vector according to its logic but tries really hard to display it the way it was created (and is seen in the output).
order(df.sorted)
[1] 37 38 39 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
[26] 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Is there a way to "reset" the ordering to 1:length(df.sorted)? That way, ordering, and the output of the vector would be in sync.
Use the mixedsort or mixedorder functions in the gtools package:
require(gtools)
mixedorder(df.sorted)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
[28] 28 29 30 31 32 33 34 35 36 37 38 39
Construct it as an ordered factor:
> df.new <- ordered(df.sorted,levels=df.sorted)
> order(df.new)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
EDIT :
After @DWin's comment, I want to add that it is not even necessary to make it an ordered factor; a plain factor is enough if you give the levels in the right order:
> df.new2 <- factor(df.sorted,levels=df.sorted)
> order(df.new2)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
The difference becomes noticeable when you use these factors in a regression analysis, where they can be treated differently. The advantage of ordered factors is that they let you use comparison operators such as < and >, which sometimes makes life a lot easier.
> df.new2[5] < df.new2[10]
[1] NA
Warning message:
In Ops.factor(df.new2[5], df.new2[10]) : < not meaningful for factors
> df.new[5] < df.new[10]
[1] TRUE
Isn't this simply the same thing you get with all lexicographic sorts (e.g. ls on directories), where walker10_foo sorts higher than walker1_foo?
The easiest way around, in my book, is to use a consistent number of digits, i.e. I would change to binned_walker01_1.grd and so on inserting a 0 for the one-digit counts.
In response to Dwin's comment on Dirk's answer: the data are always putty in your hands. "This is R. There is no if. Only how." -- Simon Blomberg
You can add 0 like so:
df.sorted <- gsub("(walker)([[:digit:]]{1}_)", "\\10\\2", df.sorted)
If you need to pad to three digits (so that one-digit counts get 00), run the substitution once for one-digit and once for two-digit counts:
df.sorted <- gsub("(walker)([[:digit:]]{1}_)", "\\10\\2", df.sorted)
df.sorted <- gsub("(walker)([[:digit:]]{2}_)", "\\10\\2", df.sorted)
...and so on.
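If renaming the files is not an option, a base R alternative is to pull the two numbers out of each name and order by them numerically. This is only a sketch, assuming every name ends in walker<number>_<number>.grd; the names files, pattern, walker, part and ord are invented for the example.

```r
# A few file names in deliberately scrambled order
files <- c("binned_walker10_1.grd", "binned_walker2_1.grd",
           "binned_walker1_2.grd", "binned_walker1_1.grd")

# Capture the walker number and the part number from each name
pattern <- ".*walker([0-9]+)_([0-9]+)\\.grd$"
walker  <- as.integer(sub(pattern, "\\1", files))
part    <- as.integer(sub(pattern, "\\2", files))

# Order numerically by walker, then by part
ord <- order(walker, part)
files[ord]
```

Because the keys are integers, walker10 sorts after walker2 regardless of how many digits the names use.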
