Change the axes' scale in a plot without creating a new variable - R

I have a dataset like below (this is only the first 20 rows and the first 3 columns of data):
row fitted measured
1 1866 1950
2 2489 2500
3 1486 1530
4 1682 1720
5 1393 1402
6 2524 2645
7 2676 2789
8 3200 3400
9 1455 1456
10 1685 1765
11 2587 2597
12 3040 3050
13 2767 2769
14 3300 3310
15 4001 4050
16 1918 2001
17 2889 2907
18 2063 2150
19 1591 1640
20 3578 3601
I plotted this data
plot(data$measured~data$fitted, ylab = expression("Measured Length (" * mu ~ "m)"),
xlab = expression("NIR Fitted Length (" * mu ~ "m)"), cex.lab=1.5, cex.axis=1.5)
and got the following:
As you can see, the axis scales are in micrometers; I need the axes to be in millimeters.
How can I plot the data with the axes in millimeters, WITHOUT creating a new variable?
Like this:
If I create a new variable, I will have to change the 2000 lines of code I have already written, and that's not a road I want to go down! :|
Thanks much :)

I used #bdemarest's method for the plot and #IukeA's method for abline:
plot(y=data$measured/1000,x=data$fitted/1000, ylab = expression("Measured Length (mm)"),
xlab = expression("NIR Fitted Length (mm)"), cex.lab=1.5, cex.axis=1.5)
a = lm(I(data$measured/1000)~I(data$fitted/1000), data=data)
abline(a)
Here is the final plot;
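Another way to get millimeter axes without touching the data is to relabel the tick marks instead of rescaling the values. The sketch below is an assumption-laden illustration, not the method used above: it rebuilds a few rows of the sample data in a data frame named data, suppresses the default axes, and draws new ones whose labels are the tick positions divided by 1000.

```r
# A few rows from the question's sample data (micrometers)
data <- data.frame(
  fitted   = c(1866, 2489, 1486, 1682, 1393),
  measured = c(1950, 2500, 1530, 1720, 1402)
)

# Plot in the original units but suppress the default axes
plot(measured ~ fitted, data = data, xaxt = "n", yaxt = "n",
     xlab = "NIR Fitted Length (mm)", ylab = "Measured Length (mm)")

# Tick positions R would have used, in micrometers
x_ticks <- axTicks(1)
y_ticks <- axTicks(2)

# Draw the axes at the same positions, labelled in millimeters
axis(1, at = x_ticks, labels = x_ticks / 1000)
axis(2, at = y_ticks, labels = y_ticks / 1000)

# The plot coordinates are still micrometers, so the model needs no rescaling
abline(lm(measured ~ fitted, data = data))
```

With this approach the regression line is fitted on the unscaled columns, so only the axis labels change.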

Create a vector from a specific sequence of intervals

I have 20 intervals:
10 intervals from 1 to 250 of size 25:
[1,25] [26,50] [51,75] [76,100] [101,125] [126,150] ... [226,250]
10 intervals from 251 to 1000 of size 75:
[251,325] [326,400] [401,475] [476,550] [551,625] ... [926,1000]
I would like to create a vector composed of the first 5 elements of each interval, like:
(1,2,3,4,5, 26,27,28,29,30, 51,52,53,54,55, 76,77,78,79,80, ....,
251,252,253,254,255, 326,327,328,329,330, ...)
How can I create this vector using R?
Let's assume you store the starting value of each interval in two vectors:
interval1 <- seq(1, 226, 25)
interval2 <- seq(251, 1000, 75)
We can combine the two into one vector of starting values and then use mapply to create the sequences:
new_interval <- c(interval1, interval2)
c(mapply(`:`, new_interval, new_interval + 4))
#[1] 1 2 3 4 5 26 27 28 29 30 51 52 53 54 .....
#[89] ..... 779 780 851 852 853 854 855 926 927 928 929 930
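The same idea can be written without mapply. This is only a sketch; the names starts and first_five are made up for the illustration, and the starting values are taken from the interval definitions in the question.

```r
# Starting value of every interval: 10 of width 25, then 10 of width 75
starts <- c(seq(1, 226, by = 25), seq(251, 926, by = 75))

# Take the first 5 integers of each interval and flatten into one vector
first_five <- unlist(lapply(starts, function(s) seq(s, length.out = 5)))

head(first_five, 10)
tail(first_five, 5)
```

Changing length.out is enough to take a different number of elements from each interval.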

Metafor Package (R) - Adding Multiple Columns

I currently have a data frame called final, shown below:
final
Gms AvgPts max min Team salary position playing.status value Players
(dbl) (dbl) (dbl) (dbl) (fctr) (dbl) (fctr) (fctr) (dbl) (chr)
1 2 87.00 113 61 STK 4300 FWD Start 20.23 Tim Membrey, STK, FWD, 4300
2 4 75.50 124 42 STK 4300 MID Start 17.56 Blake Acres, STK, MID, 4300
3 7 77.43 119 50 STK 5500 RU Start 14.08 Tom Hickey, STK, RU, 5500
4 6 87.00 110 54 WCE 6200 RU Interchange 14.03 Scott Lycett, WCE, RU, 6200
5 5 71.40 89 39 STK 5200 FWD Interchange 13.73 Jack Sinclair, STK, FWD, 5200
6 3 73.33 83 68 WCE 5400 MID Start 13.58 Mark Hutchings, WCE, MID, 5400
7 7 98.71 127 83 STK 7400 MID Interchange 13.34 Sebastian Ross, STK, MID, 7400
8 7 79.14 99 53 WCE 6100 DEF Start 12.97 Jeremy McGovern, WCE, DEF, 6100
9 7 108.29 198 67 WCE 8500 FWD Start 12.74 Josh J. Kennedy, WCE, FWD, 8500
10 6 121.00 150 57 STK 9500 FWD Start 12.74 Nick Riewoldt, STK, FWD, 9500
11 6 79.17 101 59 STK 6400 MID Start 12.37 Luke Dunstan, STK, MID, 6400
12 7 84.86 104 60 STK 7000 DEF Start 12.12 Shane Savage, STK, DEF, 7000
13 7 82.14 100 45 WCE 6900 FWD Start 11.90 Jack Darling, WCE, FWD, 6900
14 7 95.29 138 76 WCE 8100 RU Start 11.76 Nic Naitanui, WCE, RU, 8100
15 7 87.43 135 53 WCE 7500 FWD Start 11.66 Mark LeCras, WCE, FWD, 7500
16 7 74.29 92 34 WCE 6400 DEF Start 11.61 Brad Sheppard, WCE, DEF, 6400
I use the following lines of code to produce a forest plot as shown below:
forest(x = final$AvgPts, ci.lb = final$min, ci.ub = final$max,
       slab = final$Players, ilab = final$value,
       ilab.xpos = max(final$max)+10, ilab.pos = 4, yaxs = "i",
       alim = c(min(final$min)-5, max(final$max)+5), steps = 4,
       xlim = c(min(final$min)-200, 2*(max(final$max)+5)),
       xlab = "Moneyball Points Spread", efac = 0.75-.0014*k,
       cex = 0.75, mgp = c(1, 1, 0), refline = mean(final$AvgPts),
       digits = 1, col = "dark blue", pch = 19,
       main = paste("2016 Moneyball Summary - pos =",
                    paste(as.character(pos), collapse=", "),
                    "\n(avg >=", points, "-- value >=", val, "-- TOG >=", time, ")"))
text(min(final$min)-200, (nrow(final) + 1.5), "Player",pos=4,cex=0.75)
text(max(final$max)+10, (nrow(final) + 1.5), "Value",pos=4,cex=0.75)
text(2*(max(final$max)+5), (nrow(final) + 1.5), "Average[min,max]",pos=2,cex=0.75)
What I want to be able to do is add more columns than just the Value column (ilab = final$value).
Ideally I would like a solution which would be able to fit more than just one additional column as I intend to build on the final data frame with more information.
Also, is it possible to add extra columns before AND after the plot line section?
The ilab argument can also take an entire matrix or data frame (and ilab.pos and ilab.xpos should then be vectors). See help(forest.rma) for examples. And yes, if you adjust xlim so there is sufficient space to the left and the right of the points and CI lines, you can place the information to the left and right as well (just use ilab.xpos to specify where you want the various columns placed).
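As a rough sketch of that advice, the snippet below passes a three-column matrix to ilab and a matching vector of positions to ilab.xpos. The three rows are copied from the question's data; the positions in col_pos and the xlim and alim values are guesses chosen for this small example and would need tuning for the real plot. The call is guarded so the data preparation still runs if metafor is not installed.

```r
# Three rows from the question's data frame
final <- data.frame(
  Players = c("Tim Membrey", "Blake Acres", "Tom Hickey"),
  Gms     = c(2, 4, 7),
  AvgPts  = c(87.00, 75.50, 77.43),
  max     = c(113, 124, 119),
  min     = c(61, 42, 50),
  salary  = c(4300, 4300, 5500),
  value   = c(20.23, 17.56, 14.08)
)

# One matrix column per extra column to print beside the plot
extra_cols <- cbind(final$Gms, final$salary, final$value)

# One x position per column (assumed values): two left of the CIs, one right
col_pos <- c(-100, -50, 140)

if (requireNamespace("metafor", quietly = TRUE)) {
  metafor::forest(x = final$AvgPts, ci.lb = final$min, ci.ub = final$max,
                  slab = final$Players,
                  ilab = extra_cols, ilab.xpos = col_pos,
                  xlim = c(-250, 260), alim = c(30, 130))
}
```

The number of entries in ilab.xpos has to match the number of columns in ilab; widening xlim is what makes room for columns on either side.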

I want to repeat plotting for each section

I am trying to repeat a set of plotting commands in R.
My main commands are:
data<-read.csv(file.choose(),header=TRUE)
t=data[,1]
PCI=data[,2]
plot(t,PCI,xlim=c(0,30))
boxplot(t,PCI,xlim=c(0,30))
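# some starting values (the start list passed to nls below overrides these)
alpha=1
beta=1
theta=1
# define the data before fitting, since nls needs xdata and ydata to exist
ydata=data[,2]
xdata=data[,1]
# do the fit
fit = nls(ydata ~ alpha-beta*exp((-theta*(xdata^gama))), start=list(alpha=80,beta=15,theta=15,gama=-2))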
new = data.frame(xdata = seq(min(xdata),max(xdata),len=200))
lines(new$xdata,predict(fit,newdata=new))
resid(fit)
qqnorm(resid(fit))
fitted(fit)
resid(fit)
xdata=fitted(fit)
ydata=resid(fit)
plot(xdata,ydata,xlab="fitted values",ylab="Residuals")
abline(0, 0)
My first column is the section number, my second column is t (x), and my third column is PCI (y). I want to repeat my plotting commands for each section individually, but I think I cannot use a loop since the number of data points in each section is not equal.
I would really appreciate your help since I am new to R.
SecNum t PCI
961 1 94.84
961 2 93.04
961 3 91.69
961 11 80.47
961 12 79.26
961 13 77
962 1 90.46
962 2 90.01
962 3 86.88
962 4 86.36
962 5 84.56
962 6 85.11
963 1 91.33
963 2 90.7
963 3 86.46
963 4 88.47
963 5 81.07
963 6 84.07
963 7 82.55
963 8 73.58
963 9 71.85
963 10 83.8
963 11 82.16
To repeat your code for each different SecNum in your data, do something like:
sections <- unique(data$SecNum)
for (sec in sections) {
# just get the data for that section
data.section <- subset(data, SecNum == sec)
# now do all your plotting commands. `data.section` is the
# subset of `data` that corresponds to this SecNum only.
}
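To make the loop concrete, here is a small sketch that uses split() so that sections with different numbers of rows are handled the same way. The data frame holds a handful of rows from the question, and the plot call stands in for the full set of plotting and fitting commands.

```r
# A few rows from the question; sections have different numbers of rows
data <- data.frame(
  SecNum = c(961, 961, 961, 962, 962, 962, 962, 963, 963),
  t      = c(1, 2, 3, 1, 2, 3, 4, 1, 2),
  PCI    = c(94.84, 93.04, 91.69, 90.46, 90.01, 86.88, 86.36, 91.33, 90.7)
)

# One data frame per section, whatever its size
by_section <- split(data, data$SecNum)

for (sec in names(by_section)) {
  data.section <- by_section[[sec]]
  # Replace this with the full set of plotting and fitting commands
  plot(data.section$t, data.section$PCI, xlim = c(0, 30),
       xlab = "t", ylab = "PCI", main = paste("Section", sec))
}

# Number of rows that went into each section's plot
section_sizes <- sapply(by_section, nrow)
section_sizes
```

Unequal section sizes are not a problem for the loop, because each iteration only ever sees the rows of its own section.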

lmList - loss of group information

I am using lmList to do linear models on many subsets of a data frame:
res <- lmList(Rds.on.fwd~Length | Wafer, data=sub, na.action=na.omit, pool=F)
This works fine, and I get the desired output (full output not shown):
(Intercept) Length
2492 5816.726 1571.260
2493 2520.311 1361.317
2494 3058.408 1286.516
2502 4727.328 1344.728
2564 3790.942 1576.223
2567 2350.296 1290.396
I have subsetted by "Wafer" (first column above). However, within my data frame ("sub"), the data is grouped by another factor "ERF" (there are many other factors but I am only concerned with "ERF"):
head(sub):
ERF Wafer Device Row Col Width Length Date Von.fwd Vth.fwd STS.fwd On.Off.fwd Ion.fwd Ioff.fwd Rds.on.fwd
1 474 2492 11.06E 11 6 100 5 09/10/2014 12:05 0.596747 3.05655 0.295971 7874420 0.000104 1.32e-11 9626.54
3 474 2492 11.08E 11 8 100 5 09/10/2014 12:05 0.581131 3.08380 0.299050 7890780 0.000109 1.38e-11 9193.62
5 474 2492 11.09E 11 9 100 5 09/10/2014 12:05 0.578171 3.06713 0.298509 8299740 0.000107 1.29e-11 9337.86
7 474 2492 11.10E 11 10 100 5 09/10/2014 12:05 0.565504 2.95532 0.298349 8138320 0.000109 1.34e-11 9173.15
9 474 2492 11.11E 11 11 100 5 09/10/2014 12:05 0.581289 2.97091 0.297885 8463620 0.000109 1.29e-11 9178.50
11 474 2492 11.12E 11 12 100 5 09/10/2014 12:05 0.578003 3.05802 0.294260 9326360 0.000112 1.20e-11 8955.51
I do not want ERF included in my lm, but I do want to keep the factor "ERF" with the lm results for colouring graphs later, i.e. I want this:
ERF Wafer (Intercept) Length
474 2492 5816.726 1571.260
474 2493 2520.311 1361.317
474 2494 3058.408 1286.516
475 2502 4727.328 1344.728
475 2564 3790.942 1576.223
476 2567 2350.296 1290.396
I know I could do this manually later by adding a column to the results with a vector containing the correct sequence of ERF values. However, I regularly add data to the set and don't want to do this every time. I'm sure there is a more elegant way?
Thanks
Edit - data added for solution:
res <- ddply(sub, c("ERF", "Wafer"), function(x) coefficients(lm(Rds.on.fwd~Length,x)))
head(res)
ERF Wafer (Intercept) Length
1 474 2492 5816.726 1571.260
2 474 2493 2520.311 1361.317
3 474 2494 3058.408 1286.516
4 474 2502 4727.328 1344.728
5 479 2564 3790.942 1576.223
6 479 2567 2350.296 1290.396
If I drop ERF:
res <- ddply(sub, c("Wafer"), function(x) coefficients(lm(Rds.on.fwd~Length,x)))
head(res)
Wafer (Intercept) Length
1 2492 5816.726 1571.260
2 2493 2520.311 1361.317
3 2494 3058.408 1286.516
4 2502 4727.328 1344.728
5 2564 3790.942 1576.223
6 2567 2350.296 1290.396
Does this make sense? Did I ask the question incorrectly?
Ah, with a bit more research I've answered my own question based on this answer:
Regression on subset of data set
Must look harder next time. I used ddply instead of lmList (makes me wonder why anyone uses lmList... maybe I should ask another question?):
res1 <- ddply(sub, c("ERF", "Wafer"), function(x) coefficients(lm(Rds.on.fwd~Length,x)))
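For readers without plyr, the same grouping can be sketched in base R with split() and lapply(). This is a substitute technique, not the ddply call above, and the data frame here is synthetic: only the column names ERF, Wafer, Length and Rds.on.fwd come from the question, while the values are invented.

```r
set.seed(1)

# Synthetic stand-in for the question's data frame `sub`
sub <- data.frame(
  ERF    = rep(c(474, 474, 475), each = 4),
  Wafer  = rep(c(2492, 2493, 2502), each = 4),
  Length = rep(c(5, 10, 20, 40), times = 3)
)
sub$Rds.on.fwd <- 3000 + 1500 * sub$Length + rnorm(nrow(sub), sd = 50)

# One subset per ERF/Wafer combination that actually occurs
groups <- split(sub, list(sub$ERF, sub$Wafer), drop = TRUE)

# Fit one model per subset and keep ERF and Wafer next to the coefficients
res <- do.call(rbind, lapply(groups, function(x) {
  data.frame(ERF = x$ERF[1], Wafer = x$Wafer[1],
             t(coef(lm(Rds.on.fwd ~ Length, data = x))),
             check.names = FALSE)
}))
rownames(res) <- NULL
res
```

Because ERF is only used to form the groups, it never enters the model formula, yet it stays in the result for colouring plots later.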

How to ask R not to combine the X-axis values for a bar chart?

I am a beginner with R. My data looks like this:
id count date
1 210 2009.01
2 400 2009.02
3 463 2009.03
4 465 2009.04
5 509 2009.05
6 861 2009.06
7 872 2009.07
8 886 2009.08
9 725 2009.09
10 687 2009.10
11 762 2009.11
12 748 2009.12
13 678 2010.01
14 699 2010.02
15 860 2010.03
16 708 2010.04
17 709 2010.05
18 770 2010.06
19 784 2010.07
20 694 2010.08
21 669 2010.09
22 689 2010.10
23 568 2010.11
24 584 2010.12
25 592 2011.01
26 548 2011.02
27 683 2011.03
28 675 2011.04
29 824 2011.05
30 637 2011.06
31 700 2011.07
32 724 2011.08
33 629 2011.09
34 446 2011.10
35 458 2011.11
36 421 2011.12
37 459 2012.01
38 256 2012.02
39 341 2012.03
40 284 2012.04
41 321 2012.05
42 404 2012.06
43 418 2012.07
44 520 2012.08
45 546 2012.09
46 548 2012.10
47 781 2012.11
48 704 2012.12
49 765 2013.01
50 571 2013.02
51 371 2013.03
I would like to make a bar-chart-like graph that shows the count for each date (dates in Month-Year format, Jan-2009 for instance). I have two issues:
1- I cannot find a good format for a bar-chart-like graph of this kind
2- I want all of my data points to be present on the X axis (date), while R aggregates them by year (so I only have four data points there). Below is the command that I am currently using:
plot(df$date,df$domain_count,col="red",type="h")
and my current plot is like this:
Ok, I see some issues in your original data. May I suggest the following:
Add the days in your date column
df$date=paste(df$date,'.01',sep='')
Convert the date column to be of date type:
df$date=as.Date(df$date,format='%Y.%m.%d')
Plot the data again:
plot(df$date,df$domain_count,col="red",type="h")
Also, may I add one more suggestion: have you used ggplot for plotting charts? I think you will find it much easier, and it produces better-looking charts. Your example could be visualized like this:
library(ggplot2) #if you don't have the package, run install.packages('ggplot2')
ggplot(df,aes(date, count))+geom_bar(stat='identity')+labs(x="Date", y="Count")
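One caveat on the date conversion, assuming the date column was read as a number rather than as text: 2009.10 is then printed as 2009.1, so pasting '.01' onto it gives "2009.1.01", which should parse as January rather than October. A minimal sketch of a safer conversion formats the number with two decimals first; the three rows are taken from the question and date_text is a name made up for the illustration.

```r
# Rows 9 to 11 of the question's data, with date stored as a number
df <- data.frame(count = c(725, 687, 762),
                 date  = c(2009.09, 2009.10, 2009.11))

# Force two decimals so October stays "2009.10" instead of "2009.1"
date_text <- sprintf("%.2f", df$date)

# Add a day and convert to a real Date
df$date <- as.Date(paste0(date_text, ".01"), format = "%Y.%m.%d")

plot(df$date, df$count, col = "red", type = "h",
     xlab = "Date", ylab = "Count")
```

If the column was read as character, the trailing zero is already preserved and the sprintf step is unnecessary.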
First, you should transform your date column in a real date:
library(plyr) # for mutate
d <- mutate(d, month = as.numeric(gsub("[0-9]*\\.([0-9]*)", "\\1", as.character(date))),
year = as.numeric(gsub("([0-9]*)\\.[0-9]*", "\\1", as.character(date))),
Date = ISOdate(year, month, 1))
Then, you could use ggplot to create a decent barchart:
library(ggplot2)
ggplot(d, aes(x = Date, y = count)) + geom_bar(fill = "red", stat = "identity")
You can also use base R to create a bar chart, although the result is less polished:
dd <- setNames(d$count, format(d$Date, "%m-%Y"))
barplot(dd)
The former plot shows you the "holes" in your data, i.e. months where there is no count, while in the latter it is quite difficult to see which bar corresponds to which month (this could, however, be tweaked, I assume).
Hope that helps.
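One possible tweak for the base R bar chart is to rotate and shrink the bar labels so that each month is readable. The sketch below assumes the date column is stored as text and uses only the first four rows of the question's data; las and cex.names are standard barplot and par arguments.

```r
# First four rows of the question's data, date stored as text (assumption)
d <- data.frame(count = c(210, 400, 463, 465),
                date  = c("2009.01", "2009.02", "2009.03", "2009.04"),
                stringsAsFactors = FALSE)

# Convert to a real Date on the first day of each month
d$Date <- as.Date(paste0(d$date, ".01"), format = "%Y.%m.%d")

# Named vector: one bar per month, labelled month-year
dd <- setNames(d$count, format(d$Date, "%m-%Y"))

# las = 2 turns the labels perpendicular to the axis, cex.names shrinks them
barplot(dd, las = 2, cex.names = 0.8, ylab = "Count")
```

With all 51 months, a smaller cex.names value may be needed to keep the labels from overlapping.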
