R: Plotting Multiple Graphs using a "for loop" - r

I am using the R programming language.
Using the following code, I am able to put two plots on the same page:
#load libraries
library(dbscan)
#specify number of plots per page
par(mfrow = c(1,2))
#generate data
n <- 100
x <- cbind(
x=runif(10, 0, 5) + rnorm(n, sd=0.4),
y=runif(10, 0, 5) + rnorm(n, sd=0.4)
)
### calculate LOF score
lof <- lof(x, k=3)
### distribution of outlier factors (first plot)
summary(lof)
hist(lof, breaks=10)
### point size is proportional to LOF (second plot)
plot(x, pch = ".", main = "LOF (k=3)")
points(x, cex = (lof-1)*3, pch = 1, col="red")
This produces the following plot:
Now, I am trying to make several plots (e.g. 6 plots, 2 pairs of 3) on the same page. I tried to implement this with a "for loop" (for k = 3, 4, 5):
par(mfrow = c(3,2))
vals <- 3:5
combine <- vector('list', length(vals))
count <- 0
for (i in vals) {
lof_i <- lof(x, k=i)
### distribution of outlier factors
summary(lof_i)
hist(lof_i, breaks=10)
### point size is proportional to LOF
plot(x, pch = ".", main = "LOF (k=i)")
points(x, cex = (lof_i-1)*3, pch = 1, col="red")
}
However, this seems to just repeat the same graph 6 times on the same page:
Can someone please show me how to correct this code?
Is it also possible to save the files "lof_3, lof_4, lof_5"? It seems that none of these files are created, only "lof_i" is created:
> lof_3
Error: object 'lof_3' not found
> head(lof_i)
[1] 1.223307 1.033424 1.077149 1.011407 1.040634 1.431029
Thanks

Looking at your plots you seem to have generated and plotted different plots, but to have the labels correct you need to pass a variable and not a fixed character to your title (e.g. using the paste command).
To get the calculated values out of your loop you could either generate an empty list and assign the results in the loop to individual list elements, or use something like lapply that will automatically return the results in a list form.
To simplify things a bit you could define a function that either plots or returns the calculated values, e.g. like this:
library(dbscan)
#generate data
set.seed(123)
n <- 100
x <- cbind(
x=runif(10, 0, 5) + rnorm(n, sd=0.4),
y=runif(10, 0, 5) + rnorm(n, sd=0.4)
)
plotLOF <- function(i, plot=TRUE){
lof <- lof(x, k=i)
if (plot){
hist(lof, breaks=10)
plot(x, pch = ".", main = paste0("LOF (k=", i, ")"))
points(x, cex = (lof-1)*3, pch = 1, col="red")
} else return(lof)
}
par(mfrow = c(3,2))
invisible(lapply(3:5, plotLOF))
lapply(3:5, plotLOF, plot=FALSE)
#> [[1]]
#> [1] 1.1419243 0.9551471 1.0777472 1.1224447 0.8799095 1.0377858 0.8416306
#> [8] 1.0487133 1.0250496 1.3183819 0.9896833 1.0353398 1.3088266 1.0123238
#> [15] 1.1233530 0.9685039 1.0589151 1.3147785 1.0488644 0.9212146 1.2568698
#> [22] 1.0086274 1.0454450 0.9661698 1.0644528 1.1107202 1.0942201 1.5147076
#> [29] 1.0321698 1.0553455 1.1149748 0.9341090 1.2352716 0.9478602 1.4096464
#> [36] 1.0519127 1.0507267 1.3199825 1.2525485 0.9361488 1.0958563 1.2131615
#> [43] 0.9943090 1.0123238 1.1060491 1.0377766 0.9803135 0.9627699 1.1165421
#> [50] 0.9796819 0.9946925 2.1576989 1.6015310 1.5670315 0.9343637 1.0033725
#> [57] 0.8769431 0.9783065 1.0800050 1.2768800 0.9735274 1.0377472 1.0743988
#> [64] 1.7583562 1.2662485 0.9685039 1.1662145 1.2491499 1.1131718 1.0085023
#> [71] 0.9636864 1.1538360 1.2126138 1.0609829 1.0679010 1.0490234 1.1403292
#> [78] 0.9638900 1.1863703 0.9651060 0.9503445 1.0098536 0.8440855 0.9052420
#> [85] 1.2662485 1.4447713 1.0845415 1.0661381 0.9282678 0.9380078 1.1414628
#> [92] 1.0407138 1.0942201 1.0589805 1.0370938 1.0147094 1.1067291 0.8834466
#> [99] 1.7027132 1.1766560
#>
#> [[2]]
#> [1] 1.1667311 1.0409009 1.0920953 1.0068953 0.9894195 1.1332413 0.9764505
#> [8] 1.0228796 1.0446905 1.0893386 1.1211637 1.1029415 1.3453498 0.9712910
#> [15] 1.1635936 1.0265746 0.9480282 1.2144437 1.0570346 0.9314618 1.3345561
#> [22] 0.9816097 0.9929112 1.0322014 1.2739621 1.2947553 1.0202948 1.6153264
#> [29] 1.0790922 0.9987830 1.0378609 0.9622779 1.2974938 0.9129639 1.2601398
#> [36] 1.0265746 1.0241622 1.2420568 1.2204376 0.9297345 1.1148404 1.2546361
#> [43] 1.0059582 0.9819820 1.0342491 0.9452673 1.0369500 0.9791091 1.2000825
#> [50] 0.9878844 1.0205586 2.0057587 1.2757014 1.5347815 0.9622614 1.0692613
#> [57] 1.0026404 0.9408510 1.0280687 1.3534531 0.9669894 0.9300601 0.9929112
#> [64] 1.7567871 1.3861828 1.0265746 1.1120151 1.3542396 1.1562077 0.9842179
#> [71] 1.0301098 1.2326327 1.1866352 1.0403814 1.0577086 0.8745912 1.0017905
#> [78] 0.9904356 1.0602487 0.9501681 1.0176457 1.0405430 0.9718224 1.0046821
#> [85] 1.1909982 1.6151918 0.9640852 1.0141963 1.0270237 0.9867738 1.1474414
#> [92] 1.1293307 1.0323945 1.0859417 0.9622614 1.0290635 1.0186381 0.9225209
#> [99] 1.6456612 1.1366753
#>
#> [[3]]
#> [1] 1.1299335 1.0122028 1.2077092 0.9485150 1.0115694 1.1190314 0.9989174
#> [8] 1.0145663 1.0357546 0.9783702 1.1050504 1.0661798 1.3571416 1.0024603
#> [15] 1.1484745 1.0162149 0.9601474 1.1310442 1.0957731 1.0065501 1.2687934
#> [22] 0.9297323 0.9725355 0.9876444 1.2314822 1.2209304 0.9906446 1.4249452
#> [29] 1.2156607 0.9959685 1.0304305 0.9976110 1.1711354 1.0048161 0.9813000
#> [36] 1.0128909 0.9730295 1.1741982 1.3317209 0.9708714 1.0994309 1.1900047
#> [43] 0.9960765 0.9659553 0.9744357 0.9556112 1.0508484 0.9669406 1.3919743
#> [50] 0.9467537 1.0596883 1.7396644 1.1323109 1.6516971 0.9922995 1.0223594
#> [57] 0.9917594 0.9542419 1.0672565 1.2274498 1.0589385 0.9649404 0.9953886
#> [64] 1.7666795 1.3111620 0.9860706 1.0576620 1.2547512 1.0038281 0.9825967
#> [71] 1.0104708 1.1739417 1.1884817 1.0199412 0.9956941 0.9720389 0.9601474
#> [78] 0.9898781 1.1025485 0.9797453 1.0086780 1.0556471 1.0150204 1.0339022
#> [85] 1.1174116 1.5252177 0.9721734 0.9486663 1.0161640 0.9903872 1.2339874
#> [92] 1.0753099 0.9819882 1.0439012 1.0016272 1.0122706 1.0536213 0.9948601
#> [99] 1.4693656 1.0274264
Created on 2021-02-22 by the reprex package (v1.0.0)
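The first option mentioned above (an empty list that is filled inside the loop) could look roughly like the sketch below. To keep it runnable in base R without dbscan, it uses a made-up stand-in, knn_score(), which is not the LOF score; with dbscan loaded you would call lof() at that point instead. The names knn_score and scores are assumptions for illustration only.

```r
# Stand-in for dbscan::lof(): mean distance to the k nearest neighbours.
# This is NOT the LOF score; it only keeps the sketch runnable in base R.
knn_score <- function(x, k) {
  d <- as.matrix(dist(x))
  # position 1 of each sorted row is the point itself (distance 0)
  apply(d, 1, function(row) mean(sort(row)[2:(k + 1)]))
}

set.seed(123)
n <- 100
x <- cbind(
  x = runif(10, 0, 5) + rnorm(n, sd = 0.4),
  y = runif(10, 0, 5) + rnorm(n, sd = 0.4)
)

vals <- 3:5
scores <- vector("list", length(vals))      # empty list, one slot per k
names(scores) <- paste0("lof_", vals)       # lof_3, lof_4, lof_5

par(mfrow = c(3, 2))
for (i in vals) {
  s <- knn_score(x, k = i)
  scores[[paste0("lof_", i)]] <- s          # keep the result for later use
  hist(s, breaks = 10, main = paste0("Scores (k=", i, ")"))
  plot(x, pch = ".", main = paste0("Points (k=", i, ")"))  # title built from i
  points(x, cex = s * 3, pch = 1, col = "red")
}

names(scores)
head(scores$lof_3)
```

After the loop, each result is available by name (scores$lof_3, scores$lof_4, scores$lof_5), so there is no need to create separate objects in the workspace.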

You can build each command as a string and evaluate it inside the loop:
# general pattern: 'vector' stands for the column names of df
for (i in vector) {
eval(parse(text = sprintf("plot(df$%s)", i)))
}
This is a very powerful line of code and can be handy for plotting graphs in loops. Applied to your example:
for (i in vals) {
eval(parse(text = sprintf('lof_%s <- lof(x, k=%s)', i, i)))
### distribution of outlier factors
eval(parse(text = sprintf('summary(lof_%s)', i)))
eval(parse(text = sprintf('hist(lof_%s, breaks=10)', i)))
### point size is proportional to LOF
eval(parse(text = sprintf("plot(x, pch = '.', main = 'LOF (k=%s)')", i)))
eval(parse(text = sprintf("points(x, cex = (lof_%s-1)*3, pch = 1, col='red')", i)))
}
Explanation:
eval() evaluates an expression.
parse() parses the text into an expression that can be evaluated.
sprintf() builds a string by substituting the supplied arguments into the format text.
Your code is not working because, inside the quoted title "LOF (k=i)", i is treated as a literal character and does not take the value of the loop variable. To see what these functions do, run sprintf('lof_%s <- lof(x, k=%s)',i,i) on its own and look at the output.
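To make the three steps concrete, here is a small sketch that only prints the generated commands and then runs the same pattern on a harmless base-R expression. The object names sq_3, sq_4 and sq_5 are made up for this illustration and do not come from the question.

```r
# Step 1: sprintf() only builds text; nothing is evaluated yet
for (i in 3:5) {
  cmd <- sprintf("lof_%s <- lof(x, k=%s)", i, i)
  print(cmd)
}

# Steps 2 and 3: parse() turns the text into an expression, eval() runs it.
# Here the generated commands are sq_3 <- 3^2, sq_4 <- 4^2, sq_5 <- 5^2.
for (i in 3:5) {
  eval(parse(text = sprintf("sq_%s <- %s^2", i, i)))
}
sq_4
```

Keep in mind that objects created this way are harder to loop over later than the elements of a list, so this is mostly useful for quick interactive work.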

Related

Split a sequence of numbers into groups of 10 digits using R

I would like for R to read in the first 10,000 digits of Pi and group every 10 digits together
e.g., I want R to read in a sequence
pi <- 3.14159265358979323846264338327950288419716939937510582097...
and would like R to give me a table where each row contains 10 digits:
3141592653
5897932384
6264338327
...
I am new to R and really don't know where to start so any help would be much appreciated!
Thank you in advance
https://rextester.com/OQRM27791
p <- strsplit("314159265358979323846264338327950288419716939937510582097", "")
digits <- p[[1]]
split(digits, ceiling((1:length(digits)) / 10));
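Note that split() as used above returns groups of single characters. If you want each group as one 10-character string, a base-R sketch with substring() is shown below, using only the short digit string from the question. The variable names starts and chunks are made up for this example.

```r
digits <- "314159265358979323846264338327950288419716939937510582097"

# start position of every group of 10 characters
starts <- seq(1, nchar(digits), by = 10)

# cut out each group; the last one may be shorter than 10 characters
chunks <- substring(digits, starts, pmin(starts + 9, nchar(digits)))
chunks
```

The same two lines should work for a longer digit string as well, as long as the decimal point has been removed first.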
Here's one way to do it. It's fully reproducible, so just cut and paste it into your R console. The vector result is the first 10,000 digits of pi, split into 1000 strings of 10 digits.
For this many digits, I have used an online source for the precalculated value of pi. This is read in using readChar and the decimal point is stripped out with gsub. The resulting string is split into individual characters and put in a 1000 * 10 matrix (filled row-wise). The rows are then pasted into strings, giving the result. I have displayed only the first 100 entries of result for clarity of presentation.
pi_url <- "https://www.pi2e.ch/blog/wp-content/uploads/2017/03/pi_dec_1m.txt"
pi_char <- gsub("\\.", "", readChar(pi_url, 1e4 + 1))
pi_mat <- matrix(strsplit(pi_char, "")[[1]], byrow = TRUE, ncol = 10)
result <- apply(pi_mat, 1, paste0, collapse = "")
head(result, 100)
#> [1] "3141592653" "5897932384" "6264338327" "9502884197" "1693993751"
#> [6] "0582097494" "4592307816" "4062862089" "9862803482" "5342117067"
#> [11] "9821480865" "1328230664" "7093844609" "5505822317" "2535940812"
#> [16] "8481117450" "2841027019" "3852110555" "9644622948" "9549303819"
#> [21] "6442881097" "5665933446" "1284756482" "3378678316" "5271201909"
#> [26] "1456485669" "2346034861" "0454326648" "2133936072" "6024914127"
#> [31] "3724587006" "6063155881" "7488152092" "0962829254" "0917153643"
#> [36] "6789259036" "0011330530" "5488204665" "2138414695" "1941511609"
#> [41] "4330572703" "6575959195" "3092186117" "3819326117" "9310511854"
#> [46] "8074462379" "9627495673" "5188575272" "4891227938" "1830119491"
#> [51] "2983367336" "2440656643" "0860213949" "4639522473" "7190702179"
#> [56] "8609437027" "7053921717" "6293176752" "3846748184" "6766940513"
#> [61] "2000568127" "1452635608" "2778577134" "2757789609" "1736371787"
#> [66] "2146844090" "1224953430" "1465495853" "7105079227" "9689258923"
#> [71] "5420199561" "1212902196" "0864034418" "1598136297" "7477130996"
#> [76] "0518707211" "3499999983" "7297804995" "1059731732" "8160963185"
#> [81] "9502445945" "5346908302" "6425223082" "5334468503" "5261931188"
#> [86] "1710100031" "3783875288" "6587533208" "3814206171" "7766914730"
#> [91] "3598253490" "4287554687" "3115956286" "3882353787" "5937519577"
#> [96] "8185778053" "2171226806" "6130019278" "7661119590" "9216420198"
Created on 2020-07-23 by the reprex package (v0.3.0)
We can use str_extract_all from the stringr package:
pi <- readLines("https://www.pi2e.ch/blog/wp-content/uploads/2017/03/pi_dec_1m.txt")
library(stringr)
t <- unlist(str_extract_all(sub("\\.","", pi), "\\d{10}"))
t[1:100]
[1] "3141592653" "5897932384" "6264338327" "9502884197" "1693993751" "0582097494" "4592307816" "4062862089"
[9] "9862803482" "5342117067" "9821480865" "1328230664" "7093844609" "5505822317" "2535940812" "8481117450"
[17] "2841027019" "3852110555" "9644622948" "9549303819" "6442881097" "5665933446" "1284756482" "3378678316"
[25] "5271201909" "1456485669" "2346034861" "0454326648" "2133936072" "6024914127" "3724587006" "6063155881"
[33] "7488152092" "0962829254" "0917153643" "6789259036" "0011330530" "5488204665" "2138414695" "1941511609"
[41] "4330572703" "6575959195" "3092186117" "3819326117" "9310511854" "8074462379" "9627495673" "5188575272"
[49] "4891227938" "1830119491" "2983367336" "2440656643" "0860213949" "4639522473" "7190702179" "8609437027"
[57] "7053921717" "6293176752" "3846748184" "6766940513" "2000568127" "1452635608" "2778577134" "2757789609"
[65] "1736371787" "2146844090" "1224953430" "1465495853" "7105079227" "9689258923" "5420199561" "1212902196"
[73] "0864034418" "1598136297" "7477130996" "0518707211" "3499999983" "7297804995" "1059731732" "8160963185"
[81] "9502445945" "5346908302" "6425223082" "5334468503" "5261931188" "1710100031" "3783875288" "6587533208"
[89] "3814206171" "7766914730" "3598253490" "4287554687" "3115956286" "3882353787" "5937519577" "8185778053"
[97] "2171226806" "6130019278" "7661119590" "9216420198"

Interpolation of loess.smooth in R

I'm running a loess.smooth method after running the spline method on it.
The input given below is the data I get after running the spline method.
However I'm going wrong with the loess.smooth method. The entire first column is returning the output in float format but I need it in integer format with an increment of 1.
Any help would be much appreciated.
Thanks
**input:** spline_file
1 0.157587435
2 0.146704412
3 0.129899285
4 0.138925582
5 0.104085676
out <- loess.smooth(spline_file$x, spline_file$y, span = 1, degree = 1,
family = c("gaussian"), length.out = seq(1, max_exp, by = 1), surface=
"interpolate", normalize = TRUE, method="linear")
**OUTPUT:**
0 0.150404703
1.020408163 0.154413716
2.040816327 0.158458172
3.06122449 0.162515428
4.081632653 0.166562839
5.102040816 0.170577762
**OUTPUT REQUIRED:**
x y
1 0.225926707
2 0.226026551
3 0.226241194
4 0.2265471
5 0.226920733
I'm not sure if the following fully answers your question, but maybe it helps. Below is some code, a demonstrative plot and some explanations/recommendations.
You should not use a degree of 1; your data requires a higher degree.
You should check the allowed parameters via ?loess.smooth. I think you mixed up some parameters of scatter.smooth and loess.smooth and also used some parameters that do not exist for the function (e.g. normalize; please correct me if I have overlooked something).
In any case, it makes sense that the output of a smoothing function has more data points than the original data. To be able to plot a smooth curve, the smoothing function generates additional points between your data points. Check the plot generated at the end of the code below. Whether the fit is good is another question...
spline_file <- read.table(text = "
1 0.157587435
2 0.146704412
3 0.129899285
4 0.138925582
5 0.104085676
", stringsAsFactors = FALSE)
colnames(spline_file) <- c("x", "y")
spline_loess <- loess.smooth(spline_file$x, spline_file$y, span = 1, degree = 2,
family = c("gaussian")
,surface= "interpolate"
, statistics = "exact"
)
spline_loess
# $x
# [1] 1.000000 1.081633 1.163265 1.244898 1.326531 1.408163 1.489796
# [8] 1.571429 1.653061 1.734694 1.816327 1.897959 1.979592 2.061224
# [15] 2.142857 2.224490 2.306122 2.387755 2.469388 2.551020 2.632653
# [22] 2.714286 2.795918 2.877551 2.959184 3.040816 3.122449 3.204082
# [29] 3.285714 3.367347 3.448980 3.530612 3.612245 3.693878 3.775510
# [36] 3.857143 3.938776 4.020408 4.102041 4.183673 4.265306 4.346939
# [43] 4.428571 4.510204 4.591837 4.673469 4.755102 4.836735 4.918367
# [50] 5.000000
#
# $y
# [1] 0.1586807 0.1571512 0.1556485 0.1541759 0.1527367 0.1513344
# [7] 0.1499721 0.1486533 0.1473813 0.1461595 0.1449911 0.1438795
# [13] 0.1428280 0.1417881 0.1406496 0.1394364 0.1381783 0.1369053
# [19] 0.1356473 0.1344341 0.1332957 0.1322619 0.1313626 0.1306278
# [25] 0.1300873 0.1297791 0.1297453 0.1299324 0.1302747 0.1307066
# [31] 0.1311626 0.1315769 0.1318839 0.1320181 0.1319138 0.1315054
# [37] 0.1307273 0.1295270 0.1281453 0.1266888 0.1251504 0.1235232
# [43] 0.1218002 0.1199744 0.1180388 0.1159866 0.1138105 0.1115038
# [49] 0.1090594 0.1064704
plot(spline_file)
lines(spline_loess)
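If you only want fitted values at x = 1, 2, ..., one possible approach is the evaluation argument of loess.smooth, which sets the number of equally spaced points between min(x) and max(x) at which the curve is evaluated. The sketch below assumes that your x values are integers with a step of 1, as in the sample input; n_eval is a name made up for this example.

```r
spline_file <- data.frame(
  x = 1:5,
  y = c(0.157587435, 0.146704412, 0.129899285, 0.138925582, 0.104085676)
)

# one evaluation point per integer between min(x) and max(x)
n_eval <- diff(range(spline_file$x)) + 1

out <- loess.smooth(spline_file$x, spline_file$y, span = 1, degree = 2,
                    family = "gaussian", evaluation = n_eval)

# x is now 1, 2, 3, 4, 5 instead of 50 fractional values
data.frame(x = out$x, y = out$y)
```

With real data you would replace the data frame above with your own spline output; if the x values are not evenly spaced integers, fitting with loess() and calling predict() at the x values you need may be a better fit.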

R + ggplot2: plot time series with linear regression with changepoint

I have time series data with 2 variables (x, y) and I am currently using R base plot to generate a plot like this.
Each red line is a linear model fitted between 2 change points.
The data looks like this.
X
[1] 559.2 559.8 560.6 561.1 561.2 561.8
[7] 562.4 563.0 563.4 563.5 563.5 563.5
[13] 563.5 563.5 563.5 563.5 563.8 564.5
[19] 565.3 565.9 566.4 566.5 566.7 567.4
[25] 567.6 568.5 569.3 570.3 571.6 572.2
[31] 572.5 573.6 574.1 575.5 576.9 578.1
[37] 579.0 580.1 580.9 581.4 581.8 583.1
[43] 583.8 584.4 585.2 586.0 586.1 586.2
[49] 586.8 587.4
**y**
[1] 115.4375 115.3008 115.2069 115.3306 115.3900 115.1189 114.8619
[8] 114.7992 114.7117 114.4722 114.7031 115.1358 115.4811 115.4500
[15] 115.6347 115.8286 115.8361 115.7986 115.9169 116.1225 116.1803
[22] 116.3794 116.2872 116.2517 116.3411 116.4167 116.5108 116.2900
[29] 116.3456 116.3658 116.1547 116.2042 116.1517 116.2083 116.3642
[36] 116.4347 116.5428 116.5119 116.5925 116.3969 116.2614 116.3494
[43] 116.1242 116.1469 116.0872 116.1000 116.2319 116.1225 116.1069
[50] 116.1364
I am calculating the change point manually from X.
Is this kind of plot possible in ggplot2, i.e. using ggplot2 to loop through the change points and fit a linear model for each segment?
Any help would be appreciated. Thanks.
#create some fake data
segment1 = 100:1 + runif(100)*10
df1 = data.frame(value = segment1, time = 1:100, type="segment1")
segment2 = 75:1 + runif(75)*10
df2 = data.frame(value = segment2, time = 101:175, type="segment2")
segment3 = 50:1 + runif(50)*10
df3 = data.frame(value = segment3, time = 176:225, type="segment3")
data.complete = rbind(df1,df2,df3)
#create the plot
require(ggplot2)
g = ggplot(data.complete,aes(x=time,y=value))
g = g + geom_line()
g = g + geom_smooth(method = "lm",aes(group=type))
g
To keep the underlying line graph connected, the group aesthetic must be set in the smoother (geom_smooth) only, not in the main ggplot() call.
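If the change points are already known, the grouping column does not have to be typed by hand. A rough sketch using cut() is shown below; the change points 100 and 175 and the names df, change_points and segment are assumptions chosen to match the fake data above. The ggplot2 part only runs if the package is installed.

```r
set.seed(1)
df <- data.frame(
  time  = 1:225,
  value = c(100:1, 75:1, 50:1) + runif(225) * 10
)

# hypothetical change points, picked by hand for this fake data
change_points <- c(100, 175)

# label every row with the segment it falls into
df$segment <- cut(
  df$time,
  breaks = c(-Inf, change_points, Inf),
  labels = paste0("segment", seq_len(length(change_points) + 1))
)
table(df$segment)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(df, aes(x = time, y = value)) +
    geom_line() +
    # one linear fit per segment; the line itself stays connected
    geom_smooth(method = "lm", formula = y ~ x, aes(group = segment), se = FALSE)
  print(g)
}
```

This way there is no explicit loop: geom_smooth fits one linear model per value of segment.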

Understanding xgb.dump

I'm trying to understand the intuition about what is going on in the xgb.dump of a binary classification with an interaction depth of 1. Specifically, how is the same split used twice in a row (f38 < 2.5) (output lines 2 and 6)?
The resulting output looks like this:
xgb.dump(model_2,with.stats=T)
[1] "booster[0]"
[2] "0:[f38<2.5] yes=1,no=2,missing=1,gain=173.793,cover=6317"
[3] "1:leaf=-0.0366182,cover=3279.75"
[4] "2:leaf=-0.0466305,cover=3037.25"
[5] "booster[1]"
[6] "0:[f38<2.5] yes=1,no=2,missing=1,gain=163.887,cover=6314.25"
[7] "1:leaf=-0.035532,cover=3278.65"
[8] "2:leaf=-0.0452568,cover=3035.6"
Is the difference between the first use of f38 and the second use of f38 simply the residual fitting going on? At first it seemed weird to me, and trying to understand exactly what's going on here!
Thanks!
Is the difference between the first use of f38 and the second use of f38 simply the residual fitting going on?
Most likely yes. It updates the gradient after the first round and then finds the same feature and split point again in your example.
Here's a reproducible example.
Note how I lower the learning rate in the second example and it finds the same feature and the same split point again in all three rounds. In the first example it uses different features in all 3 rounds.
require(xgboost)
data(agaricus.train, package='xgboost')
train <- agaricus.train
dtrain <- xgb.DMatrix(data = train$data, label=train$label)
#high learning rate, finds a different first split feature (f28, f59, f101) in each tree
bst <- xgboost(data = train$data, label = train$label, max_depth = 2, eta = 1, nrounds = 3,nthread = 2, objective = "binary:logistic")
xgb.dump(model = bst)
# [1] "booster[0]" "0:[f28<-9.53674e-07] yes=1,no=2,missing=1"
# [3] "1:[f55<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=1.71218"
# [5] "4:leaf=-1.70044" "2:[f108<-9.53674e-07] yes=5,no=6,missing=5"
# [7] "5:leaf=-1.94071" "6:leaf=1.85965"
# [9] "booster[1]" "0:[f59<-9.53674e-07] yes=1,no=2,missing=1"
# [11] "1:[f28<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=0.784718"
# [13] "4:leaf=-0.96853" "2:leaf=-6.23624"
# [15] "booster[2]" "0:[f101<-9.53674e-07] yes=1,no=2,missing=1"
# [17] "1:[f66<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=0.658725"
# [19] "4:leaf=5.77229" "2:[f110<-9.53674e-07] yes=5,no=6,missing=5"
# [21] "5:leaf=-0.791407" "6:leaf=-9.42142"
## changed eta to a lower learning rate, finds the same feature (f28) in the first split of each tree
bst2 <- xgboost(data = train$data, label = train$label, max_depth = 2, eta = .01, nrounds = 3,nthread = 2, objective = "binary:logistic")
xgb.dump(model = bst2)
# [1] "booster[0]" "0:[f28<-9.53674e-07] yes=1,no=2,missing=1"
# [3] "1:[f55<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=0.0171218"
# [5] "4:leaf=-0.0170044" "2:[f108<-9.53674e-07] yes=5,no=6,missing=5"
# [7] "5:leaf=-0.0194071" "6:leaf=0.0185965"
# [9] "booster[1]" "0:[f28<-9.53674e-07] yes=1,no=2,missing=1"
# [11] "1:[f55<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=0.016952"
# [13] "4:leaf=-0.0168371" "2:[f108<-9.53674e-07] yes=5,no=6,missing=5"
# [15] "5:leaf=-0.0192151" "6:leaf=0.0184251"
# [17] "booster[2]" "0:[f28<-9.53674e-07] yes=1,no=2,missing=1"
# [19] "1:[f55<-9.53674e-07] yes=3,no=4,missing=3" "3:leaf=0.0167863"
# [21] "4:leaf=-0.0166737" "2:[f108<-9.53674e-07] yes=5,no=6,missing=5"
# [23] "5:leaf=-0.0190286" "6:leaf=0.0182581"
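The effect can also be illustrated without xgboost. The sketch below is a toy squared-error booster with depth-1 trees on one made-up feature; it is not xgboost's algorithm (which uses gradients and hessians of the logistic loss), and best_split, eta and splits are names invented for this example. With a small learning rate, each round removes only a fraction of the signal, so the next stump tends to land at roughly the same place.

```r
set.seed(42)
x <- runif(200, 0, 5)
y <- as.numeric(x >= 2.5) + rnorm(200, sd = 0.1)   # step at 2.5 plus noise

# best single split (a "stump") on x for squared-error loss
best_split <- function(x, r) {
  xs <- sort(unique(x))
  cand <- (head(xs, -1) + tail(xs, -1)) / 2        # midpoints between neighbours
  sse <- sapply(cand, function(s) {
    left <- r[x < s]
    right <- r[x >= s]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  s <- cand[which.min(sse)]
  list(split = s, left = mean(r[x < s]), right = mean(r[x >= s]))
}

eta <- 0.1                       # low learning rate
pred <- rep(0, length(y))
splits <- numeric(3)
for (m in 1:3) {
  r <- y - pred                  # residuals = negative gradient for squared error
  st <- best_split(x, r)
  splits[m] <- st$split
  # add only a fraction eta of the stump to the current prediction
  pred <- pred + eta * ifelse(x < st$split, st$left, st$right)
}
splits                           # split points chosen in rounds 1 to 3
```

Each round fits the residuals left over by the previous rounds, which is why the leaf values in the dump shrink slightly from booster[0] to booster[1] while the split itself stays the same.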

Change scaling of data on the x-axis

I am plotting my data like this:
(dput(sale))
structure(c(-0.049668136, 0.023675638, -0.032249731, -0.071487224,
-0.034017265, -0.031278933, -0.052070721, -0.034305542, -0.019041209,
-0.050459175, -0.017315808, -0.012787003, -0.03341208, -0.045078144,
-0.036638132, -0.036533367, -0.012683656, -0.014388251, -0.006775188,
-0.037153807, -0.008941402, -0.011760677, -0.005077979, -0.041187417,
-0.001966554, -0.028822067, 0.021828558, 0.016208791, -0.026897492,
-0.032107207, -0.008496522, -0.028027096, -0.013746662, -0.004545603,
-0.005679941, -0.004614187, 0.004083014, -0.012624954, -0.016362079,
-0.006350167, -0.019551277), na.action = structure(42:45, class = "omit"))
[1] -0.049668136 0.023675638 -0.032249731 -0.071487224 -0.034017265
[6] -0.031278933 -0.052070721 -0.034305542 -0.019041209 -0.050459175
[11] -0.017315808 -0.012787003 -0.033412080 -0.045078144 -0.036638132
[16] -0.036533367 -0.012683656 -0.014388251 -0.006775188 -0.037153807
[21] -0.008941402 -0.011760677 -0.005077979 -0.041187417 -0.001966554
[26] -0.028822067 0.021828558 0.016208791 -0.026897492 -0.032107207
[31] -0.008496522 -0.028027096 -0.013746662 -0.004545603 -0.005679941
[36] -0.004614187 0.004083014 -0.012624954 -0.016362079 -0.006350167
[41] -0.019551277
attr(,"na.action")
[1] 42 43 44 45
attr(,"class")
[1] "omit"
(dput(purchase))
structure(c(0.042141187, 0.075875128, 0.090953485, 0.050951625,
0.082566915, 0.184396833, 0.136625887, 0.042725409, 0.135028692,
0.13201904, 0.093634104, 0.16776844, 0.13645719, 0.201365036,
0.227589832, 0.236473792, 0.269064385, 0.200981722, 0.144739536,
0.145256493, 0.040205545, 0.031577107, 0.014767345, 0.005843065,
0.034805051, 0.082493053, 0.010572227, 0.000645763, 0.033368236,
0.024326153, 0.038601182, 0.025446045, 0.000556418, 0.017201608,
0.008316872, 0.059722053, 0.059695415, 0.076940829, 0.067650014,
0.002029566, 0.008466334), na.action = structure(42:45, class = "omit"))
[1] 0.042141187 0.075875128 0.090953485 0.050951625 0.082566915 0.184396833
[7] 0.136625887 0.042725409 0.135028692 0.132019040 0.093634104 0.167768440
[13] 0.136457190 0.201365036 0.227589832 0.236473792 0.269064385 0.200981722
[19] 0.144739536 0.145256493 0.040205545 0.031577107 0.014767345 0.005843065
[25] 0.034805051 0.082493053 0.010572227 0.000645763 0.033368236 0.024326153
[31] 0.038601182 0.025446045 0.000556418 0.017201608 0.008316872 0.059722053
[37] 0.059695415 0.076940829 0.067650014 0.002029566 0.008466334
attr(,"na.action")
[1] 42 43 44 45
attr(,"class")
[1] "omit"
timeLine <- c(-20 , +20)
plot(sale,type="b", xlim=timeLine, ylim=c(-.1,.4) )
lines( purchase, type="b")
abline(v=0, col="black")
The plot I get looks like that:
What's wrong with the plot is the scaling. My graphs should start at -20 and go to +20, with each value -20, -19, -18, ..., +19, +20 being one point in the graph. In my exported csv sheet I have a row with these values. My question is: how can I start from -20 so that every data point sits on an integer up to +20? Is it also possible to display every integer from -20 to +20 on the axis?
I really appreciate your answer!
UPDATE
The scaling of the axis:
By default, the values are plotted against their index (starting at 1) when x is not specified in plot. You have to create a vector for the x axis.
timeLine <- c(-20 , 20)
# this command generates a sequence from -20 to 20
timeSeq <- Reduce(seq, timeLine)
# now, this sequence is passed to `x`
plot(sale, x = timeSeq, type = "b", xlim = timeLine, ylim = c(-.1, .4) )
lines(purchase, x = timeSeq, type = "b")
abline(v = 0, col = "black")
Update: how to show all x axis labels?
You can show all x axis labels if you decrease their size (cex.axis) and increase the width of the plot. Here's an example.
png("plot.png", width = 1000)
plot(sale,type="b", x = timeSeq, xlim=timeLine, ylim=c(-.1,.4),
xaxt = "n")
lines( purchase, type="b", x = timeSeq)
abline(v=0, col="black")
axis(side = 1, at = timeSeq, cex.axis = 0.75)
dev.off()
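As a quick check of the idea, the sketch below builds the x vector with plain seq() and confirms that it has one value per data point before plotting. The vector sale_demo is random stand-in data made up for this example, not the sale values from the question.

```r
timeLine <- c(-20, 20)

# same result as Reduce(seq, timeLine): -20, -19, ..., 20
timeSeq <- seq(timeLine[1], timeLine[2])
length(timeSeq)

# stand-in data with one value per x position (made up for illustration)
set.seed(1)
sale_demo <- rnorm(length(timeSeq), sd = 0.02)

# x and y must have the same length, otherwise plot() stops with an error
stopifnot(length(sale_demo) == length(timeSeq))

plot(timeSeq, sale_demo, type = "b", xlim = timeLine, xaxt = "n")
axis(side = 1, at = timeSeq, cex.axis = 0.6)   # one tick per integer
abline(v = 0, col = "black")
```

Your sale and purchase vectors each hold 41 values, which matches the 41 integers from -20 to 20.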
