Using lapply and else if command - R

Using R, I have a table, let's say 'locations':
head(locations, n=10)
apillar fender fwheel fdoor compart rdoor rwheel boot
1 0 0 0 0 0 0 0 1
2 0 0 0 1 0 0 0 0
3 0 0 0 0 1 0 0 0
4 0 1 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0
6 1 0 0 1 0 0 0 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 1 0 0 0 0
10 0 0 0 0 0 1 0 0
Now I want to create a new variable "cat" which groups the impacts into location categories.
I have been using the if, else if and else commands, but I cannot get it to work.
The command is:
cat <- lapply(locations, function(x) if (apillar|fender|fwheel == 1)print("front") else if (fdoor|compart|rdoor == 1)print("middle") else if(rwheel|boot ==1)print("rear") else print("NA")
such that cat should read rear, middle, middle, middle, front etc

When vectors of TRUE or FALSE statements are involved, I usually prefer not to work with if to avoid loops. I find conditional referencing to be more elegant in this case. See below.
locations <- read.table(header=TRUE, text=
"apillar fender fwheel fdoor compart rdoor rwheel boot
1 0 0 0 0 0 0 0 1
2 0 0 0 1 0 0 0 0
3 0 0 0 0 1 0 0 0
4 0 1 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0
6 1 0 0 1 0 0 0 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 1 0 0 0 0
10 0 0 0 0 0 1 0 0")
locations$cat <- NA
within(locations,{
cat[apillar|fender|fwheel] <- "front"
cat[fdoor|compart|rdoor] <- "middle"
cat[rwheel|boot] <- "rear"
})
Result:
apillar fender fwheel fdoor compart rdoor rwheel boot cat
1 0 0 0 0 0 0 0 1 rear
2 0 0 0 1 0 0 0 0 middle
3 0 0 0 0 1 0 0 0 middle
4 0 1 0 0 0 0 0 0 front
5 1 0 1 0 0 0 0 0 front
6 1 0 0 1 0 0 0 0 middle
7 0 0 0 0 0 0 0 0 <NA>
8 0 0 0 0 1 0 0 0 middle
9 0 0 0 1 0 0 0 0 middle
10 0 0 0 0 0 1 0 0 middle
Cheers!

Corrected your own code:
locations$cat <- with(locations,
  ifelse(apillar | fender | fwheel, "front",
  ifelse(fdoor | compart | rdoor, "middle",
  ifelse(rwheel | boot, "rear", "NA"))))
> locations
apillar fender fwheel fdoor compart rdoor rwheel boot cat
1 0 0 0 0 0 0 0 1 rear
2 0 0 0 1 0 0 0 0 middle
3 0 0 0 0 1 0 0 0 middle
4 0 1 0 0 0 0 0 0 front
5 1 0 1 0 0 0 0 0 front
6 1 0 0 1 0 0 0 0 front
7 0 0 0 0 0 0 0 0 NA
8 0 0 0 0 1 0 0 0 middle
9 0 0 0 1 0 0 0 0 middle
10 0 0 0 0 0 1 0 0 middle
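The two answers above disagree on row 6, which has both apillar and fdoor set. A minimal sketch of why, on a cut-down table with only three of the columns (the toy data and the names cat1 and cat2 are illustrative, not from the question): with logical indexing the last matching assignment wins, while with nested ifelse the first matching condition wins. The sketch uses NA_character_ for a real missing value rather than the string "NA".

```r
# Toy subset of the 'locations' table (hypothetical values)
locations <- data.frame(apillar = c(0, 1, 1, 0),
                        fdoor   = c(0, 0, 1, 0),
                        boot    = c(1, 0, 0, 0))

# Logical indexing: later assignments overwrite earlier ones
cat1 <- rep(NA_character_, nrow(locations))
cat1[locations$apillar == 1] <- "front"
cat1[locations$fdoor == 1]   <- "middle"
cat1[locations$boot == 1]    <- "rear"

# Nested ifelse: the first condition that is TRUE decides the label
cat2 <- with(locations,
             ifelse(apillar == 1, "front",
             ifelse(fdoor == 1, "middle",
             ifelse(boot == 1, "rear", NA_character_))))

cat1  # "rear" "front" "middle" NA
cat2  # "rear" "front" "front"  NA
```

Which behaviour is right depends on how an impact touching two zones should be classified, so pick the ordering deliberately.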

Related

R: Simulating ERGM model in R then generate adjacency matrix of that model

I use library(ergm) and library(igraph) to generate an ERGM network. But I want the adjacency matrix of that network, and I am unable to find any function which can produce it.
library(ergm)
library(igraph)
g.use <- network(16,density=0.1,directed=FALSE)
#
# Starting from this network let's draw 3 realizations
# of a edges and 2-star network
#
g.sim <- simulate(~edges+kstar(2), nsim=3, coef=c(-1.8,0.03),
basis=g.use, control=control.simulate(
MCMC.burnin=1000,
MCMC.interval=100))
#g.sim[[3]]
summary(g.sim)
Is it possible to find the adjacency matrix from g.sim? and how?
The ergm package uses the network package and not the igraph package. You should keep everything in network and not load igraph, as the two have some conflicting functions with the same names.
In your case, you simulate 3 graphs thus you should have 3 adjacency matrices. The code is as below:
library(ergm)
g.use <- network(16,density=0.1,directed=FALSE)
g.sim <- simulate(~edges+kstar(2), nsim=3, coef=c(-1.8,0.03),
basis=g.use, control=control.simulate(
MCMC.burnin=1000,
MCMC.interval=100))
The code you want:
lapply(g.sim, as.matrix)
[[1]]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0
3 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 1
4 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0
5 1 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0
6 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0
7 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1
8 0 1 0 0 0 0 0 0 0 1 1 1 1 0 1 0
9 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
10 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0
11 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0
12 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
13 0 0 1 0 1 0 0 1 0 1 1 0 0 0 0 1
14 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0
15 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
16 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0
[[2]]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0
2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
3 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0
4 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1
6 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 1
7 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
8 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0
9 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
11 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0
12 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1
13 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
15 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
16 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0
[[3]]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
2 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0
3 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0
4 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
5 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 1 0 0 0 0 1 0 1 0 0 0 1 0 1 0
7 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0
8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
10 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1
11 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1
12 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0
13 1 1 0 1 0 1 0 0 0 0 0 1 0 0 0 0
14 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
15 0 0 1 0 0 1 1 0 0 1 0 1 0 0 0 1
16 1 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0

R multiple for loop

I have this loop over the file msp.chr1
for(i in names(msp.chr1[c(7:70)])){
tmp <- rle(msp.chr1[[i]])$lengths
msp.chr1$idx <- rep(1:length(tmp),tmp)
tmp2 <- unlist(by(msp.chr1[msp.chr1[[i]]==1,], list(msp.chr1$idx[msp.chr1[[i]]==1]),function(x){tail(x["epos"],1)-head(x["spos"],1)}))
assign(paste(i, ".chr1", sep=""), as.vector(tmp2))
rm(i); rm(tmp); rm(tmp2)
}
This file is a dataframe with multiple columns:
head(msp.chr1)
chm spos epos sgpos egpos nsnps PDAC1.0 PDAC1.1 PDAC10.0 PDAC10.1 PDAC100.0 PDAC100.1 PDAC101.0 PDAC101.1 PDAC102.0 PDAC102.1 PDAC103.0 PDAC103.1
1 1 123492 134160 0.12 0.13 252 0 0 0 0 1 0 0 0 0 0 0 0
2 1 134160 135025 0.13 0.14 20 0 0 0 0 1 0 0 0 0 0 0 0
3 1 135025 145600 0.14 0.15 150 0 0 0 0 1 0 0 0 0 0 0 0
4 1 145600 316603 0.15 0.32 195 0 1 0 0 1 0 0 1 0 0 0 1
5 1 316603 520140 0.32 0.52 765 0 0 0 0 0 0 0 0 0 0 0 0
6 1 520140 667054 0.52 0.67 1080 0 0 0 0 0 0 0 0 0 0 0 0
PDAC104.0 PDAC104.1 PDAC105.0 PDAC105.1 PDAC11.0 PDAC11.1 PDAC12.0 PDAC12.1 PDAC13.0 PDAC13.1 PDAC14.0 PDAC14.1 PDAC15.0 PDAC15.1 PDAC17.0 PDAC17.1
1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
3 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
4 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PDAC18.0 PDAC18.1 PDAC19.0 PDAC19.1 PDAC2.0 PDAC2.1 PDAC20.0 PDAC20.1 PDAC21.0 PDAC21.1 PDAC22.0 PDAC22.1 PDAC23.0 PDAC23.1 PDAC24.0 PDAC24.1 PDAC25.0
1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0
2 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
3 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
4 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
PDAC25.1 PDAC3.0 PDAC3.1 PDAC4.0 PDAC4.1 PDAC5.0 PDAC5.1 PDAC6.0 PDAC6.1 PDAC7.0 PDAC7.1 PDAC8.0 PDAC8.1 PDAC807.0 PDAC807.1 PDAC810.0 PDAC810.1
1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
PDAC9.0 PDAC9.1 idx
1 0 0 1
2 0 0 1
3 0 0 1
4 0 0 1
5 1 0 1
6 1 0 1
But I actually have 23 files, named msp.chr1, msp.chr2, ..., msp.chr23.
I want to add another loop around the one above, to process all the files at once.
I tried several things but none of them worked.
Basically, every chr1 in my loop (including in the assign) should be replaced by chr1 to chr23.
Can you help?
Thanks,
You can build the name of each data frame with paste, and then fetch the object by that name with get. A better option would be to keep these data frames in a list, so that you only need j, as in df=list[[j]].
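for(j in 1:23){
  # fetch the data frame msp.chr<j> by name
  df = get(paste("msp.chr",j,sep=""))
  for(i in names(df[c(7:70)])){
    # label consecutive runs of identical values in column i
    tmp <- rle(df[[i]])$lengths
    df$idx <- rep(1:length(tmp),tmp)
    # for each run of 1s: last epos minus first spos
    tmp2 <- unlist(by(df[df[[i]]==1,], list(df$idx[df[[i]]==1]),function(x){tail(x["epos"],1)-head(x["spos"],1)}))
    # store under <column>.chr<j> instead of a fixed ".chr1" suffix
    assign(paste(i, ".chr", j, sep=""), as.vector(tmp2))
    rm(i); rm(tmp); rm(tmp2)
  }
}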
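The list-based option mentioned above could look roughly like this. It is a sketch under assumptions: the two toy data frames, the sample column names A.0 and A.1, and the helper run.lengths are made up for illustration, and the run boundaries are computed directly from the rle result instead of with by(), which is a different technique from the loop in the question but gives the same lengths on this toy data.

```r
# Toy stand-ins for msp.chr1 and msp.chr2 (hypothetical data)
msp.chr1 <- data.frame(spos = c(1, 10, 20, 30), epos = c(10, 20, 30, 40),
                       A.0 = c(1, 1, 0, 1), A.1 = c(0, 1, 1, 0))
msp.chr2 <- data.frame(spos = c(5, 15, 25), epos = c(15, 25, 35),
                       A.0 = c(0, 1, 1), A.1 = c(1, 0, 1))

# Collect the data frames into one named list
chr.list <- mget(paste0("msp.chr", 1:2))

# For each run of 1s in a column: last epos of the run minus its first spos
run.lengths <- function(df, col) {
  r <- rle(df[[col]])
  ends <- cumsum(r$lengths)         # last row of every run
  starts <- ends - r$lengths + 1    # first row of every run
  keep <- r$values == 1             # only runs of 1s
  df$epos[ends[keep]] - df$spos[starts[keep]]
}

sample.cols <- c("A.0", "A.1")
result <- lapply(chr.list, function(df) {
  sapply(sample.cols, function(col) run.lengths(df, col), simplify = FALSE)
})

result$msp.chr1$A.0  # 19 10
```

The results stay together in one nested list (result[["msp.chr2"]][["A.1"]] and so on), so no assign() or get() is needed afterwards.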

Standard deviation error for EcoTest.sample

I am using EcoTest.sample to compare rarefaction curves for 19 vegetation plots on two soil types (alluvial and canyon). The code below produces the following
warning (more than 50 times): "In cor(x > 0) : the standard deviation is zero".
The test still produces all the expected output. Should I be concerned about the warnings? Is it a result of my relatively small sample size?
rawdata<-read.table(text="Plot SiteType sp1 sp2 sp3 sp4 sp5 sp6 sp7 sp8 sp9 sp10 sp11 sp12 sp13 sp14 sp15 sp16 sp17 sp18 sp19 sp20 sp21 sp22 sp23 sp24 sp25 sp26 sp27 sp28 sp29 sp30 sp31 sp32 sp33 sp34 sp35
2 canyon 1 0 1 0 1 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0
3 alluvial 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
5 alluvial 1 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
6 alluvial 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0
7 alluvial 1 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
8 alluvial 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0
10 alluvial 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0
11 canyon 1 1 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 0
12 canyon 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
13 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0
14 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
15 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0
16 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
17 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
18 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
19 canyon 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0
20 canyon 1 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1
22 alluvial 1 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 0
23 alluvial 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0
", header=T)
data<-rawdata[,-1]
rownames(data)<-rawdata[,1]
test.data<-EcoTest.sample(data[,-1], by=data$SiteType, MARGIN=1, trace=F)
EDIT: Perhaps you need to set the type of index using q. For instance, if I use q=2, the inverse Simpson index, I cannot reproduce your warning. As it stands you're using q=0, the species richness. Perhaps there's nothing to do other than use a different index. I'm not aware of the factors affecting index choice. I've read a thing or two here: http://www.tiem.utk.edu/~gross/bioed/bealsmodules/shannonDI.html and found this paper, which I did not read in much detail: https://dx.doi.org/10.1002%2Fece3.1155
Using Simpson's index: No warnings.
test.data<-EcoTest.sample(data[,-1], by=data$SiteType, MARGIN=1, trace=F,q=2)
Sample-based method
P(Obs <= null) = 0.205
As stated in this answer on SE, a standard deviation of zero will have an impact on the nature of the distribution. Therefore, any tests you perform that depend on a normal distribution will likely be erroneous. The p-values obtained by, say, a t-test may therefore be "insignificant."
When standard deviation is zero, your Gaussian (normal) PDF turns into Dirac delta function. You can't simply plug zero standard deviation into the conventional expression. For instance, if the PDF is plugged into some kind of numerical integration, this won't work. (Aksakal on SE)
https://stats.stackexchange.com/questions/233834/what-is-the-normal-distribution-when-standard-deviation-is-zero
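One plausible source of the warning, sketched in base R: cor() warns whenever a column has no variation. In presence/absence form (x > 0), a species that occurs in every plot of a subsample, or in none, is such a column. Whether this is exactly what happens inside EcoTest.sample is an assumption; the toy matrix below is made up for illustration.

```r
# Toy presence/absence matrix: sp1 occurs in every plot (hypothetical data)
x <- cbind(sp1 = c(1, 1, 1, 1),
           sp2 = c(0, 1, 0, 1),
           sp3 = c(1, 0, 0, 1))

# Capture the warning message instead of printing it
msg <- tryCatch(cor(x > 0), warning = function(w) conditionMessage(w))
msg

# Identify the columns with zero variance
constant <- apply(x > 0, 2, function(col) length(unique(col)) == 1)
names(which(constant))  # "sp1"
```

In the question's data, sp1 is present in all but one plot, so constant columns within a subsample are quite likely with only 19 plots.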

creating a larger matrix from smaller matrices in R

I have a series of text files in a folder called "Disintegration T1" which look like this:
> 1.txt
0 0 0 0 1
1 0 0 0 1
0 1 0 0 1
0 0 0 0 0
1 1 1 1 0
> 2.txt
0 1 1 0 1
0 0 1 1 1
1 1 0 1 1
1 1 1 0 1
0 0 0 0 1
> 3.txt
0 1 1 1
1 0 0 0
0 0 0 0
1 0 0 0
The files are all either 4x4 or 5x5. They must be read in as matrices, as the data is for social network analyses. My goal is to automate the process of putting these matrices into a larger matrix, so that they sit along the diagonal one after another, with 0s filling the blank spaces within the larger matrix. In this case the final result would look like:
> mega_matrix
0 0 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 0 0 0 0 0 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 0 1 1 0 0 0 0
0 0 0 0 0 1 1 1 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0
Thank you!
You want bdiag from the Matrix package:
library(Matrix)
bdiag(matrix1, matrix2, matrix3)
And to do the whole directory (thanks to @user20650 in the comments):
bdiag(lapply(dir(), function(x){as.matrix(read.table(x))}))
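A small in-memory sketch of what bdiag returns, using two made-up matrices (m1 and m2 are illustrative, not the files from the question). The result is a sparse Matrix object, so as.matrix is used here to get an ordinary dense matrix.

```r
library(Matrix)  # recommended package, shipped with standard R installations

m1 <- matrix(c(0, 1, 1, 0), nrow = 2)  # 2x2 block
m2 <- matrix(1, nrow = 3, ncol = 3)    # 3x3 block

mega <- bdiag(list(m1, m2))     # sparse block-diagonal matrix
mega_matrix <- as.matrix(mega)  # ordinary dense matrix, zeros off the blocks
dim(mega_matrix)                # 5 5
```

When reading a whole folder, note that dir() returns file names in alphabetical order, so "10.txt" would come before "2.txt"; sort the names numerically first if the block order matters.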

How do I change the numbering of the x axis in R to just 2 values?

I want to create a histogram from my data set of the frequency of students who have had broken bones. The values are either 0 or 1.
I.E:
[1] 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
[38] 1 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
[75] 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 1
[112] 1 1 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0
[149] 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 0
[186] 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
[223] 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1
[260] 1 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
[297] 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 1
[334] 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0
[371] 0 0 0 0 0 0 0 0 0 0 0 0
However, the scale on the x axis of the graph has increments of 0.2. I just want either 0 or 1, as the data is categorical. Would anyone please kindly tell me how to rectify this?
What you need is a combination of assigning the appropriate values to the breaks argument and the xaxp argument in ?hist. Consider:
# this just gives me your data:
my.data <- "
0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
1 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 1
1 1 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0
0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 0
0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0
0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 1
0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0"
# scan() splits on any whitespace, including the line breaks
my.data <- scan(text = my.data)
hist(my.data, breaks=c(-.5, .5, 1.5), xaxp=c(0,1,1))
breaks is used to define exactly 2 bins, and xaxp is used to change the number and placement of the tick marks on the x axis (for more on how xaxp works, see this excellent answer: R, change the spacing of tick marks on the axis of a plot?).
On a different note, it is not clear how informative a histogram is for data like this (or perhaps even ever, see: assessing-approximate-distribution-of-data-based-on-a-histogram on stats.SE). You might just as well try:
> table(my.data)
my.data
0 1
296 86
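If a picture is still wanted, a bar plot of those counts is one option for 0/1 data, since it labels the bars by category and has no continuous axis to fix. A minimal sketch with simulated data (the vector broken and the axis labels are made up for illustration):

```r
set.seed(1)
broken <- rbinom(50, size = 1, prob = 0.25)  # hypothetical 0/1 responses

# factor() with explicit levels keeps both categories even if one is absent
counts <- table(factor(broken, levels = c(0, 1)))

barplot(counts, xlab = "Broken bone (0 = no, 1 = yes)", ylab = "Frequency")
```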
