I want to read data into R from the clipboard, but the resulting dimensions are wrong. How can I read the data from the clipboard correctly, and how can I tell which separator the data uses?
My data looks like this:
group month Estimate lwr upr
placebo 0 18.7 17.6 19.9
placebo 6 21.5 20.3 22.7
placebo 12 24.3 22.8 25.7
placebo 18 27.0 25.2 28.9
active 0 18.7 17.6 19.9
active 6 20.8 19.6 22.0
active 12 22.9 21.4 24.3
active 18 25.0 23.1 26.8
This is the code I tried:
d1 <- read.delim('clipboard')
d2 <- readClipboard()
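A minimal sketch of why the dimensions can come out wrong, assuming the copied text is separated by spaces rather than tabs. read.delim splits on tabs by default, so space-separated text ends up in a single column, while read.table with its default sep = "" splits on any whitespace. The clipboard is replaced here by a text string (txt is a stand-in, not part of the original question) so the example runs anywhere; on Windows the analogous call would be read.table("clipboard", header = TRUE).

```r
# Stand-in for the clipboard contents (first rows of the data above).
txt <- "group month Estimate lwr upr
placebo 0 18.7 17.6 19.9
placebo 6 21.5 20.3 22.7
active 0 18.7 17.6 19.9"

# read.delim assumes tab-separated fields, so everything lands in one column.
d_tab <- read.delim(text = txt)
dim(d_tab)

# read.table's default sep = "" splits on any run of whitespace.
d_ws <- read.table(text = txt, header = TRUE)
dim(d_ws)

# count.fields shows how many fields each line has for a candidate separator,
# which is one way to check which separator the data really uses.
n_tab <- count.fields(textConnection(txt), sep = "\t")
n_ws  <- count.fields(textConnection(txt), sep = "")
```

If n_tab is 1 on every line, the text contains no tabs, and whitespace splitting is the one to use.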
Good evening,
I am working in RStudio.
I have a problem multiplying two data frames of different sizes. What makes it harder is that the values in the row with d2$ID == 1 must multiply only the rows of w where w$sample == 1.
sample and ID identify the same sample.
In other words, within the "subset" d2$ID == 1, every single value ("L1", "ST", "GR", "CB", "HSK", "DDM") has to multiply the whole "subset" w$sample == 1 (4 rows in this case, but not always), that is, all the values in "G2", "G4", "G6", "G8", "G12".
>d2
ID L1 ST GR CB HSK DDM
1 1 0.1662000 0.2337000 0.3637000 0.11110000 0.10100000 0.024300000
2 2 0.1896576 0.2280830 0.3705740 0.09406879 0.09319434 0.024422281
3 3 0.1110259 0.2217769 0.4180797 0.11122498 0.10902635 0.028866094
4 4 0.1558785 0.2008862 0.4222565 0.09805538 0.10218119 0.020742172
5 5 0.1536421 0.1674096 0.4205395 0.14362176 0.08635519 0.028431849
6 6 0.1841964 0.1514189 0.4603306 0.10243621 0.08928011 0.012337688
> w
sample G2 G4 G6 G8 G12
1 1 10.9 15.9 21.4 28.0 37.8
2 1 11.5 16.6 22.2 29.5 38.3
3 1 10.3 15.1 20.7 28.3 36.7
4 1 11.7 18.1 24.8 31.2 39.5
5 2 11.0 16.8 22.4 30.6 38.0
6 2 10.1 15.9 22.5 30.2 36.7
7 2 12.8 17.8 22.8 28.7 37.1
8 2 11.8 16.3 20.8 27.3 34.7
9 2 11.9 16.7 21.6 28.3 34.6
10 3 12.0 18.1 24.2 30.9 40.0
11 3 12.2 17.7 24.2 31.7 40.5
12 4 11.1 16.5 22.7 31.0 39.2
13 4 12.5 19.8 27.4 32.8 38.8
14 4 12.4 19.2 25.8 33.0 39.9
15 4 12.4 19.2 26.2 33.4 38.9
16 4 13.4 18.3 23.7 30.0 38.2
17 5 13.3 18.6 24.0 30.7 38.4
18 5 13.3 18.1 22.9 30.1 36.8
19 5 13.7 19.9 26.5 33.8 43.0
20 5 12.7 18.2 24.6 32.5 41.3
21 6 12.1 17.5 24.3 33.7 42.2
22 6 14.5 20.8 28.4 35.3 43.7
I have already checked a lot of questions but I can't figure it out, especially because most of the information is about matrices of the same size.
I tried filtering the data from d2, but the data set is really big, so that is very inefficient.
I am a beginner; if you think this is easy, I would appreciate at least a hint, please!
I have several data sets like these.
Thanks in advance!
This seems to perform as requested:
res <- apply(w, 1, function(x) {
  # row of d2 whose ID matches the sample value of this row of w
  i <- match(x[1], d2$ID)
  unclass(outer(as.matrix(x[-1]),
                as.matrix(d2[i, c("L1", "ST", "GR", "CB", "HSK", "DDM")])))
})
str(res)
# result
# num [1:30, 1:22] 1.81 2.64 3.56 4.65 6.28 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:22] "1" "2" "3" "4" ...
I almost got it right on the first pass, but after some debugging I found that I needed to add the as.matrix call to both arguments inside outer (so to speak ;-). To explain my logic: I wanted to run down each row of w with apply and then use match to look up the value in the first column of that row among the IDs in d2. The match function is designed for just this purpose: it returns a position that can be used for indexing. Then, with the rest of the row (x[-1] by the time it is passed through the function call), I would use outer on the row values crossed with the desired row and columns of d2. If you do it without the as.matrix calls you get an error message:
Error in tcrossprod(x, y) :
requires numeric/complex matrix/vector arguments
I don't think that's a very informative error message. x[-1] is a numeric vector, but a one-row selection from d2 is still a data frame (a list), which is probably what outer is objecting to.
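The row matching described above can also be sketched with the match index computed once for all rows of w. This is only an illustration on a cut-down copy of the data (two IDs, two columns on each side); the names idx and res_list are made up for the example. unlist turns each one-row data frame into a named numeric vector, so outer works without as.matrix and keeps the column names as dimnames.

```r
# Cut-down copies of d2 and w from the question (values taken from the post).
d2 <- data.frame(ID = 1:2,
                 L1 = c(0.1662, 0.1896576),
                 ST = c(0.2337, 0.2280830))
w <- data.frame(sample = c(1, 1, 2),
                G2 = c(10.9, 11.5, 11.0),
                G4 = c(15.9, 16.6, 16.8))

# For every row of w, the position of the matching ID in d2.
idx <- match(w$sample, d2$ID)

# One small matrix per row of w: G-values crossed with that sample's d2 values.
res_list <- lapply(seq_len(nrow(w)), function(i) {
  outer(unlist(w[i, -1]), unlist(d2[idx[i], -1]))
})

res_list[[1]]  # rows G2, G4; columns L1, ST
```

Keeping the result as a list of matrices makes it easy to check a single row by name, for example res_list[[3]]["G4", "ST"].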
I'm very new to R. I'm trying out the base boxplot function, where ~ is used to combine different variables on the x axis.
My book gives two examples
boxplot(len ~ supp, data = ToothGrowth)
and
boxplot(len ~ supp + dose, data = ToothGrowth)
I do understand the first one, but what does + in boxplot(len ~ supp + dose, data = ToothGrowth) do? The output is confusing for me (shown below).
In the second instance, len ~ supp + dose is the equivalent of doing:
TG_split <- with(
  ToothGrowth,
  split(len, list(supp, dose))
)
boxplot(TG_split)
i.e. it splits the len vector by the two factors supp and dose, and gives you the values of len for every combination of the two factors.
TG_split
$OJ.0.5
[1] 15.2 21.5 17.6 9.7 14.5 10.0 8.2 9.4 16.5 9.7
$VC.0.5
[1] 4.2 11.5 7.3 5.8 6.4 10.0 11.2 11.2 5.2 7.0
$OJ.1
[1] 19.7 23.3 23.6 26.4 20.0 25.2 25.8 21.2 14.5 27.3
$VC.1
[1] 16.5 16.5 15.2 17.3 22.5 17.3 13.6 14.5 18.8 15.5
$OJ.2
[1] 25.5 26.4 22.4 24.5 24.8 30.9 26.4 27.3 29.4 23.0
$VC.2
[1] 23.6 18.5 33.9 25.5 26.4 32.5 26.7 21.5 23.3 29.5
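A short sketch to check this equivalence without looking at the plot, using the built-in ToothGrowth data. boxplot returns its statistics invisibly, and plot = FALSE suppresses the drawing, so the group labels can be compared directly. The names groups and bp are just illustrative.

```r
# Split len by every combination of supp and dose, as in the answer above.
groups <- with(ToothGrowth, split(len, list(supp, dose)))
names(groups)    # one name per supp/dose combination
lengths(groups)  # number of observations in each group

# Ask boxplot for its summary only; bp$names holds the x-axis labels.
bp <- boxplot(len ~ supp + dose, data = ToothGrowth, plot = FALSE)
bp$names
```

So the + in the formula does not add anything numerically; it crosses the two grouping variables and draws one box per combination.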
I have the following data.
HEIrank1
HEI.ID X2007 X2008 X2009 X2010 X2011 X2012
1 OP 41.8 147.6 90.3 82.9 106.8 63.0
2 MO 20.0 20.8 21.1 20.9 12.6 20.6
3 SD 21.2 32.3 25.7 23.9 25.0 40.1
4 UN 51.8 39.8 19.9 20.9 21.6 22.5
5 WS 18.0 19.9 15.3 13.6 15.7 15.2
6 BF 11.5 36.9 20.0 23.2 18.2 23.8
7 ME 34.2 30.3 28.4 30.1 31.5 25.6
8 IM 7.7 18.1 20.5 14.6 17.2 17.1
9 OM 11.4 11.2 12.2 11.1 13.4 19.2
10 DC 14.3 28.7 20.1 17.0 22.3 16.2
11 OC 28.6 44.0 24.9 27.9 34.0 30.7
12 TH 7.4 10.0 5.8 8.8 8.7 8.6
13 CC 12.1 11.0 12.2 12.1 14.9 15.0
14 MM 11.7 24.2 18.4 18.6 31.9 31.7
15 MC 19.0 13.7 17.0 20.4 20.5 12.1
16 SH 11.4 24.8 26.1 12.7 19.9 25.9
17 SB 13.0 22.8 15.9 17.6 17.2 9.6
18 SN 11.5 18.6 22.9 12.0 20.3 11.6
19 ER 10.8 13.2 20.0 11.0 14.9 14.2
20 SL 44.9 21.6 21.3 26.5 17.0 8.0
I tried the following commands to draw a regression line for one HEI.
year <- c(2007, 2008, 2009, 2010, 2011, 2012)
# drop the HEI.ID column so that op has one value per year
op <- as.numeric(HEIrank1[1, -1])
lm.r <- lm(op ~ year)
plot(year, op)
abline(lm.r)
I want to draw a regression line for each college in one graph and I do not know how. Can you help me?
Here's my approach with ggplot2, but the graph is hard to interpret with that many lines.
library(ggplot2)
library(reshape2)
mdat <- melt(HEIrank1, id.vars="HEI.ID", variable.name="year")
mdat$year <- as.numeric(substring(mdat$year, 2))
ggplot(mdat, aes(year, value, colour=HEI.ID, group=HEI.ID)) +
geom_point() + stat_smooth(se = FALSE, method="lm")
Faceting may be a better way to go:
ggplot(mdat, aes(year, value, group=HEI.ID)) +
geom_point() + stat_smooth(se = FALSE, method="lm") +
facet_wrap(~HEI.ID)
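For comparison, a base R sketch that stays close to the lm/abline code in the question: fit one model per row and add each line to the same plot. It uses only the first three rows of the data to keep the example short; HEI, vals, fits and slopes are names chosen for the illustration.

```r
# First three colleges from the question's data.
HEI <- data.frame(
  HEI.ID = c("OP", "MO", "SD"),
  X2007 = c(41.8, 20.0, 21.2), X2008 = c(147.6, 20.8, 32.3),
  X2009 = c(90.3, 21.1, 25.7), X2010 = c(82.9, 20.9, 23.9),
  X2011 = c(106.8, 12.6, 25.0), X2012 = c(63.0, 20.6, 40.1)
)

year <- as.numeric(sub("^X", "", names(HEI)[-1]))  # 2007 ... 2012
vals <- as.matrix(HEI[, -1])                       # one row per college

# One simple linear regression per college.
fits <- lapply(seq_len(nrow(vals)), function(i) lm(vals[i, ] ~ year))
names(fits) <- as.character(HEI$HEI.ID)

# Empty plot covering all the data, then points and a fitted line per college.
plot(range(year), range(vals), type = "n", xlab = "year", ylab = "value")
for (i in seq_along(fits)) {
  points(year, vals[i, ], col = i)
  abline(fits[[i]], col = i)
}
legend("topright", legend = names(fits), col = seq_along(fits), lty = 1)

# Slopes of the fitted lines, one per college.
slopes <- sapply(fits, function(f) coef(f)[["year"]])
```

With all 20 colleges the same readability problem applies as with the single ggplot2 panel, so this is mainly useful for a handful of rows.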
I'm using R for the analysis in my master's thesis.
I have the following data frame, STOF (student-to-staff ratio):
HEI.ID X2007 X2008 X2009 X2010 X2011 X2012
1 OP 41.8 147.6 90.3 82.9 106.8 63.0
2 MO 20.0 20.8 21.1 20.9 12.6 20.6
3 SD 21.2 32.3 25.7 23.9 25.0 40.1
4 UN 51.8 39.8 19.9 20.9 21.6 22.5
5 WS 18.0 19.9 15.3 13.6 15.7 15.2
6 BF 11.5 36.9 20.0 23.2 18.2 23.8
7 ME 34.2 30.3 28.4 30.1 31.5 25.6
8 IM 7.7 18.1 20.5 14.6 17.2 17.1
9 OM 11.4 11.2 12.2 11.1 13.4 19.2
10 DC 14.3 28.7 20.1 17.0 22.3 16.2
11 OC 28.6 44.0 24.9 27.9 34.0 30.7
Then I rank the colleges using these commands:
HEIrank1 <- STOF[, -1]
rank1 <- apply(HEIrank1, 2, rank)
> HEIrank11
HEI.ID X2007 X2008 X2009 X2010 X2011 X2012
1 OP 18.0 20 20.0 20.0 20.0 20
2 MO 14.0 9 13.0 13.5 2.0 12
3 SD 15.0 16 17.0 16.0 16.0 19
4 UN 20.0 18 8.0 13.5 14.0 13
5 WS 12.0 8 4.0 7.0 6.0 8
6 BF 6.5 17 9.5 15.0 10.0 14
7 ME 17.0 15 19.0 19.0 17.0 15
8 IM 2.0 6 12.0 8.0 8.5 10
9 OM 4.5 3 2.5 3.0 3.0 11
10 DC 11.0 14 11.0 9.0 15.0 9
11 OC 16.0 19 16.0 18.0 19.0 17
I would like to draw a histogram for each HEI (that is, for each row). How can I do that?
If you use ggplot2 you won't need a loop; you can plot them all at once. You do need to reshape your data first so that it is in long format rather than wide format. The melt function from the reshape2 package does that.
library(reshape2)
new.df <- melt(HEIrank11, id.vars="HEI.ID")
names(new.df) <- c("HEI.ID", "Year", "Rank")
In the plot call, substring just gets rid of the X in front of each year. Because the ranks are already computed, this is a bar chart of the values (geom_bar with stat="identity") rather than a binned histogram:
library(ggplot2)
ggplot(new.df, aes(x=HEI.ID, y=Rank, fill=substring(Year, 2))) +
geom_bar(stat="identity", position="dodge")
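To make the long format concrete, here is a small base R sketch of the same reshaping without reshape2. It uses only two rows and two years of the ranks shown in the question; wide and long are illustrative names.

```r
# Two colleges, two years, in wide format (one column per year).
wide <- data.frame(HEI.ID = c("OP", "MO"),
                   X2007 = c(18, 14),
                   X2008 = c(20, 9))

# Long format: one row per college/year pair.
# unlist() walks through the year columns one after another, so the IDs are
# repeated as a block per year and each year label is repeated per college.
long <- data.frame(
  HEI.ID = rep(as.character(wide$HEI.ID), times = ncol(wide) - 1),
  Year   = rep(sub("^X", "", names(wide)[-1]), each = nrow(wide)),
  Rank   = unlist(wide[-1], use.names = FALSE)
)
long
```

Each value that used to sit in a year column now has its own row, which is the layout that the fill and x aesthetics expect.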
Here's a solution in lattice:
require(lattice)
barchart(X2007+X2008+X2009+X2010+X2011+X2012 ~ HEI.ID,
data=HEIrank11,
auto.key=list(space='right')
)
Does anyone know of an R package for calculating partial R^2 in multiple regression? I've tried the command partial.R2 from the package asbio, but it gives error messages even with the example from the supplied documentation.
Many thanks.
I've found out that the command lm.sumSquares from the package lmSupport provides partial and semipartial correlations.
The data are from 'Applied Linear Statistical Models' by John Neter, Michael H. Kutner, William Wasserman, and Christopher J. Nachtsheim,
Section 7.4, page 274:
# body fat example from Neter et al. via rhelp archives:
bf.dat <- read.table(text="x1 x2 x3 y
1 19.5 43.1 29.1 11.9
2 24.7 49.8 28.2 22.8
3 30.7 51.9 37.0 18.7
4 29.8 54.3 31.1 20.1
5 19.1 42.2 30.9 12.9
6 25.6 53.9 23.7 21.7
7 31.4 58.5 27.6 27.1
8 27.9 52.1 30.6 25.4
9 22.1 49.9 23.2 21.3
10 25.5 53.5 24.8 19.3
11 31.1 56.6 30.0 25.4
12 30.4 56.7 28.3 27.2
13 18.7 46.5 23.0 11.7
14 19.7 44.2 28.6 17.8
15 14.6 42.7 21.3 12.8
16 29.5 54.4 30.1 23.9
17 27.7 55.3 25.7 22.6
18 30.2 58.6 24.6 25.4
19 22.7 48.2 27.1 14.8
20 25.2 51.0 27.5 21.1 ", header=TRUE)
library(rms) # will also load Hmisc
fit <- ols(y ~ x1 + x2, data=bf.dat)
plt <- plot(anova(fit), what='partial R2')
plt
# x2 x1
#0.066955220 0.007010427
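For readers without rms, a hedged base R sketch of two quantities that both get called "partial R^2", computed from residual sums of squares on the same body fat data (x3 is left out because the model above does not use it). The first is the textbook coefficient of partial determination, the extra sum of squares divided by the error sum of squares of the model without that term. The second divides the same extra sum of squares by the total sum of squares; this appears to be the scale of the values plotted above, but that reading is an assumption rather than something the post states. All object names are illustrative.

```r
bf <- read.table(text = "x1 x2 y
19.5 43.1 11.9
24.7 49.8 22.8
30.7 51.9 18.7
29.8 54.3 20.1
19.1 42.2 12.9
25.6 53.9 21.7
31.4 58.5 27.1
27.9 52.1 25.4
22.1 49.9 21.3
25.5 53.5 19.3
31.1 56.6 25.4
30.4 56.7 27.2
18.7 46.5 11.7
19.7 44.2 17.8
14.6 42.7 12.8
29.5 54.4 23.9
27.7 55.3 22.6
30.2 58.6 25.4
22.7 48.2 14.8
25.2 51.0 21.1", header = TRUE)

full    <- lm(y ~ x1 + x2, data = bf)
only_x1 <- lm(y ~ x1, data = bf)
only_x2 <- lm(y ~ x2, data = bf)

# deviance() of an lm fit is its residual (error) sum of squares.
sst <- sum((bf$y - mean(bf$y))^2)

# Extra sum of squares explained by each term once the other is in the model.
extra_ss <- c(x1 = deviance(only_x2) - deviance(full),
              x2 = deviance(only_x1) - deviance(full))

# Coefficient of partial determination: share of the error left by the
# reduced model that the added term removes.
partial_r2 <- extra_ss / c(x1 = deviance(only_x2), x2 = deviance(only_x1))

# Same extra sum of squares expressed as a share of the total variation.
share_of_total <- extra_ss / sst

partial_r2
share_of_total
```

The two versions differ only in the denominator, so it is worth checking which one a given package reports before comparing numbers.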