Choose which factor levels to plot - r

I'm wondering what the best way is to exclude certain factor levels from a plot in ggplot.
I have data that looks something like:
Group Time Freq
A 1 5000
B 1 70
C 1 60
...
I'm then using geom_path to plot how these frequencies change over time.
For all of the time periods, group A has far more observations than the other groups, so I'd like to create some graphs that do not include group A.
I'm wondering what the best way to do that is. Is there something I can pass to ggplot?

Simplest thing is to filter the dataframe:
df[df$Group != "A",]
Or
subset(df, Group != "A")
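A minimal sketch of how the filtered data frame might then be used. The values in df are made up to match the shape shown in the question, and the aesthetics (Time on x, Freq on y, colour by Group) are an assumption about the intended plot. droplevels() is added so the unused level "A" does not linger in the legend or in later tabulations.

```r
# Hypothetical data in the shape described in the question
df <- data.frame(
  Group = factor(rep(c("A", "B", "C"), times = 2)),
  Time  = rep(1:2, each = 3),
  Freq  = c(5000, 70, 60, 5200, 85, 55)
)

# Keep every group except A, then drop the now-empty level "A"
df_small <- droplevels(subset(df, Group != "A"))
levels(df_small$Group)

# The filtered data frame can be passed straight to ggplot()
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_small, aes(x = Time, y = Freq, colour = Group)) +
    geom_path()
  print(p)
}
```

To leave out several groups at once, the condition can be written as !(Group %in% c("A", "B")).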

Related

How to order observations in R for a graphic?

How can I order the observations in one variable? I need this for my graphic. Right now the observations are sorted from 1 to 5, and I need them in the order 5, 3, 1, 2, 4.
For context: this variable is the x-axis of my graphic. I'm making a discrete geom_bar and need this ordering to visualize the data better (the y-axis is just the count).
Thankful for any help!
{ggplot2} sorts numeric and character data itself when it draws a discrete axis. To impose an order on your data, you need to
convert it to a factor, and
set the factor levels in your desired order.
Luckily this is very easy in a single step using the factor function with explicit levels (observations being your vector of values):
observations = factor(observations, levels = c(5, 3, 1, 2, 4))
I understand that you have a vector of observations - this should do the trick when "observations" is your vector of values:
observations <- 1:5 # example data
new_order <- observations[c(5,3,1,2,4)]
new_order
[1] 5 3 1 2 4
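A small sketch to check which of these controls the axis. On a discrete axis it is the order of the factor levels, not the order of the elements in the vector, that decides where each bar goes. The values below are made up for illustration.

```r
# Hypothetical observations on a 1-5 scale
observations <- c(1, 2, 3, 4, 5, 3, 3, 5)

# Fix the level order explicitly; the data values themselves are unchanged
obs_f <- factor(observations, levels = c(5, 3, 1, 2, 4))

levels(obs_f)   # level order that a discrete axis would follow
table(obs_f)    # counts are reported in that same order
```

Passing obs_f (rather than the raw numeric vector) as the x aesthetic of geom_bar should then give bars in the order 5, 3, 1, 2, 4.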

calculate frequency, separate and transpose column that have two factor variable in R

This is my data: https://www.dropbox.com/s/msf0ro8saav7wbl/data1.txt?dl=0 (dataA). I want to extract "Habitat" into a frequency table so that I can calculate statistics such as the mean and variance, and also draw plots such as a boxplot using ggplot2.
I tried the solution from the suggested duplicate, "R: How to get common counts (frequency) of levels of two factor variables by ID Variable (as new data frame)", but I don't think it solves my problem.
Here's the easiest way to get a data.frame with frequencies using table. I'm using t to transpose and as.data.frame.matrix to transform it into a data.frame.
as.data.frame.matrix(t(table(data1)))
         A B C
Adult    1 2 1
Juvenile 2 0 0
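Since the linked file is not reproduced here, the following is only a sketch with a made-up two-column data frame. The column names Site and Stage are hypothetical stand-ins for whatever the real columns are called. It shows the wide table from the answer and, as an alternative, the long form that ggplot2 usually prefers.

```r
# Hypothetical stand-in for data1: two categorical columns
data1 <- data.frame(
  Site  = c("A", "A", "A", "B", "B", "C"),
  Stage = c("Adult", "Juvenile", "Juvenile", "Adult", "Adult", "Adult")
)

# Wide form: one row per Stage, one column per Site
freq_wide <- as.data.frame.matrix(t(table(data1)))
freq_wide

# Long form: one row per Site/Stage combination with a Freq column
freq_long <- as.data.frame(table(data1))
freq_long
```

The wide form is convenient for rowMeans() or apply(freq_wide, 1, var); the long form can be passed to ggplot() directly.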

Plotting two vectors of different sizes, only want to plot matching column names

I'm in grad school and am on the steep R learning curve at the moment.
I'm trying to plot data from a clinical study that uses randomized patient IDs, which I have set up as the column names. The study had two parts, X and Y. Some patients participated only in part X and then dropped out, so X has more values than Y. I want to make a scatterplot of X vs Y using only the patients that participated in both parts. The number of patients is quite large, so I can't just manually remove the ones that only did part X.
Any help?
Basically my two sets of data look like this:
X:
ID1 ID2 ID3 ID4
1 2 3 4
Y:
ID2 ID4
6 7
I've tried to do something like:
plot(x[,colnames(y)],y)
To try and only plot the values from X that have the column names in Y but this obviously didn't work.
Thanks in advance!!
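One possible approach, sketched under the assumption that X and Y are one-row data frames whose column names are the patient IDs (the question does not say how they are stored). intersect() finds the IDs present in both, and indexing both objects by that same vector keeps the values paired up in the same order.

```r
# Assumed layout: one row of values, patient IDs as column names
X <- data.frame(ID1 = 1, ID2 = 2, ID3 = 3, ID4 = 4)
Y <- data.frame(ID2 = 6, ID4 = 7)

# IDs that appear in both parts of the study
common <- intersect(colnames(X), colnames(Y))

# Pull the matching values out as plain named vectors, in the same ID order
x_vals <- unlist(X[1, common])
y_vals <- unlist(Y[1, common])

plot(x_vals, y_vals, xlab = "X", ylab = "Y")
```

Indexing both X and Y by common also guards against Y containing an ID that is missing from X.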

Plot in R with one column having more than 1 occurence of values

Basically, I have 3 columns: Id, A and B.
The same Id value appears in several rows, each with different A and B values.
I need to plot A against B for each Id.
Any idea how to do this in R?
You mean this basic graph:
library(ggplot2)
# one row per Id, with A and B drawn against the same x axis
data <- data.frame(Id=c(1,2,3), A=c(10,20,30), B=c(100,200,300))
ggplot(data, aes(x=Id)) +
  geom_point(aes(y=A), col="brown1", size=5) +
  geom_point(aes(y=B), col="blue", size=5) +
  ylab("A-B")
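If the goal is instead one A-versus-B plot per Id, with several rows per Id as the question describes, a base-graphics sketch could look like the following. The data frame dat and its values are made up for illustration.

```r
# Hypothetical data: each Id occurs several times with different A and B
dat <- data.frame(
  Id = c(1, 1, 1, 2, 2, 3, 3, 3),
  A  = c(10, 12, 15, 20, 24, 30, 33, 31),
  B  = c(100, 130, 160, 200, 210, 300, 280, 310)
)

# One data frame per Id
by_id <- split(dat, dat$Id)

# Draw one panel per Id, side by side, then restore the old layout
op <- par(mfrow = c(1, length(by_id)))
for (id in names(by_id)) {
  plot(B ~ A, data = by_id[[id]], main = paste("Id", id))
}
par(op)
```

With ggplot2, a similar layout could be had by mapping x = A, y = B and adding facet_wrap(~ Id).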

How do I do a Barplot of already tabled data?

I have input data with very many observations per day.
I can make the barplot I want with 1 day by using table().
If I try to load more than 1 day and call table() I run out of memory.
I'm trying to table() each day and concatenate the tables into totals I can then barplot later. But I just cannot work out how to take the already tabled data and barplot each day as a stacked column.
After looping and consolidating I end up with something like this: 2 days of observations. (the Freq column is the default from the previous table() calls)
What is the best way to do a stacked barplot when my data ends up like this?
> data.frame(CLIENT=c("Mr Fluffy","Peppa Pig","Mr Fluffy","Dr Who"), Freq=c(18414000,9000000,7000000,15000000), DAY=c("2011-11-03","2011-11-03","2011-11-04","2011-11-04"))
     CLIENT     Freq        DAY
1 Mr Fluffy 18414000 2011-11-03
2 Peppa Pig  9000000 2011-11-03
3 Mr Fluffy  7000000 2011-11-04
4    Dr Who 15000000 2011-11-04
>
> # What should I put here?
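I'm assuming that you are using base graphics since you mention barplot. Here is an approach using that, with the data frame from the question stored as dat:
wide <- reshape(dat, idvar="CLIENT", timevar="DAY", direction="wide")
wide[is.na(wide)] <- 0 # a client with no observations on a day would otherwise be NA
barplot(as.matrix(wide[-1]), beside=FALSE)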
Alternatively, using ggplot2:
library("ggplot2")
ggplot(dat, aes(x=DAY, y=Freq)) +
  geom_bar(aes(fill=CLIENT), stat="identity", position="stack")
Try ggplot2:
ggplot(df,aes(DAY,fill=CLIENT,weight=Freq))+geom_bar()
Shamelessly ripped from here:
http://had.co.nz/ggplot2/geom_bar.html
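Another base-R route, offered as a sketch rather than the answerers' method: xtabs() can sum the existing Freq column into a CLIENT by DAY matrix, filling combinations that never occur with 0, and barplot() stacks the columns of such a matrix by default. The data frame is the one from the question, stored here under the assumed name dat.

```r
# Data frame from the question, stored as dat
dat <- data.frame(
  CLIENT = c("Mr Fluffy", "Peppa Pig", "Mr Fluffy", "Dr Who"),
  Freq   = c(18414000, 9000000, 7000000, 15000000),
  DAY    = c("2011-11-03", "2011-11-03", "2011-11-04", "2011-11-04")
)

# Sum Freq into a CLIENT x DAY table; missing combinations become 0
counts <- xtabs(Freq ~ CLIENT + DAY, data = dat)
counts

# One stacked column per day, one segment per client
barplot(counts, legend.text = TRUE)
```

Because xtabs() sums Freq, it also copes with the same CLIENT and DAY pair appearing in more than one of the concatenated per-day tables.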
