Remove NA values for bar plot in ggplot2. - r

The issue is pretty straightforward.
I am trying to generate a bar plot containing GDP per capita for several countries. The data is incomplete as some of the values are missing.
For instance, for year 1960, I only have data for 3 countries.
When I plot the data, the graph that is returned includes NAs, which I would like to exclude from the plot.
Please see an example below:
GDP <- ggplot(data, aes(country,1960)) + geom_bar(stat = 'identity', na.rm = T)
Please note that in this case, each column represents a year. The first column of the data frame consists of the name of the countries.
If it helps, please see the attached screenshot.
Thanks.
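One possible approach, sketched on made-up data since the original data frame is not shown: drop the rows that are NA for the year being plotted before handing the data to ggplot. The data frame `gdp`, its country labels, and its values are hypothetical. According to the ggplot2 documentation, `na.rm = TRUE` only suppresses the warning about removed missing values; as far as I can tell, the countries with missing values still keep their slot on the x-axis, which is why filtering first may be needed. Note also that `aes(country, 1960)` maps y to the constant 1960, so the column name needs backticks.

```r
library(ggplot2)

# Hypothetical data in the layout described above: one row per country,
# one column per year, with missing values stored as NA.
gdp <- data.frame(
  country = c("A", "B", "C", "D", "E"),
  `1960` = c(1200, NA, 850, NA, 2300),
  check.names = FALSE  # keep "1960" as the column name
)

# Keep only the rows that have a value for 1960.
gdp_1960 <- gdp[!is.na(gdp[["1960"]]), ]

# Backticks make ggplot2 read 1960 as a column name, not the number 1960.
p <- ggplot(gdp_1960, aes(x = country, y = `1960`)) +
  geom_col()
```

Calling `print(p)` should then draw bars only for the countries that have data for that year.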

Related

How to plot a gg barplot for a single factor column?

My data frame has 621 rows and each column describes something about it. I'm trying to do an exploratory data analysis where I plot all the data as a bar plot.
I have a factor column called Phenotype, which has 86 levels that describe the main condition in my cohort. I want to plot this as 86 separate bars in ggplot, each showing the total number of people who have that condition.
I've attached a screenshot of my data below. I want the x-axis to show the condition names, such as 'Bardet-Biedl Syndrome' and 'Classic Ehlers Danlos Syndrome', and the y-axis to show the number of people who have each condition, such as 3, 4 or 5, as displayed below. I got the data below by running
table(data.frame$Phenotype)
I'm using the below code to generate my ggplot
ggplot (tiering, aes(x = Phenotype, y = count(tiering$Phenotype))) +
theme_bw() +
geom_bar(stat = "identity")
I'm sure the answer is out there, but I've looked on the R help websites and I can't seem to figure this out, so would be very grateful for the help.
EDIT: I got to a bar plot with the help of the below code. I am now trying to reorder the bars in decreasing order and tried this method, but it hasn't worked. Would anyone have any suggestions?
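A minimal sketch of one way to get the counts and the decreasing order, assuming one row per person. The small `tiering` data frame here is a hypothetical stand-in (including the made-up level "Hypothetical Condition"), and `counts` and `n` are names introduced only for illustration. The idea is to count first, sort, and then fix the factor level order, because ggplot2 orders a discrete axis by factor levels rather than by row order.

```r
library(ggplot2)

# Hypothetical stand-in for the tiering data frame: one row per person.
tiering <- data.frame(
  Phenotype = c("Bardet-Biedl Syndrome", "Classic Ehlers Danlos Syndrome",
                "Bardet-Biedl Syndrome", "Hypothetical Condition",
                "Bardet-Biedl Syndrome", "Classic Ehlers Danlos Syndrome")
)

# Count people per phenotype, then sort by decreasing count.
counts <- as.data.frame(table(Phenotype = tiering$Phenotype),
                        responseName = "n")
counts <- counts[order(-counts$n), ]

# Fix the factor level order so the bars follow the sorted counts.
counts$Phenotype <- factor(counts$Phenotype,
                           levels = as.character(counts$Phenotype))

p <- ggplot(counts, aes(x = Phenotype, y = n)) +
  geom_col() +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
```

With 86 levels, rotating the axis labels as above (or adding `coord_flip()`) may help keep the condition names readable.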

Density plot for multiple group shows one line, however legend shows 3

I am analyzing US election search volume data from Google Trends. I typed the command below in RStudio.
The poliData dataframe contains the SearchVolume for all months for three Politicians.
ggplot(data = poliData, aes(x=Date, group=Politician, colour=Politician)) +
geom_density()
But with the above command I only get the density line (blue) for one politician. See the attached picture. Can you please help?
I guess you got three lines on top of each other because the Date values are the same for all three politicians. Based on my understanding of your analysis, you could try something like this:
ggplot(data = poliData,
aes(x=Date, colour=Politician,
weight = SearchVolume/sum(SearchVolume))) +
geom_density()
Adding weight should produce distinct lines for different politicians. If this is not what you wanted, please dput your data for others to work out a solution for you. Also, as I do not have the data, I have not tested the above code yet. Please let me know if it does not work.

Graph Creation in r

I am trying to calculate the city-wise spend on each product on a yearly basis. I also want to include a graphical representation, but I am not able to get the graphs in R.
Top_11 <- aggregate(Ca_spend["Amount"],
by = Ca_spend[c("City","Product","Month_Year")],
FUN="sum")
A <- ggplot(Top_11,aes(x=City,Month_Year,y=Amount))
A <-geom_bar(stat="identity",position='dodge',fill="firebrick1",colour="black")
A <- A+facet_grid(.~Type)
This is the code I am using. I am trying to plot City, Product and Year on the same graph.
VARIABLES-(City product Month_Year Amount)
(OBSERVATIONS)- New York Gold 2004 $50,0000 (Sample DATA Type)
I'd try this:
ggplot(Top_11,aes(x=City, fill = Product, y=Amount)) +
geom_col() +
facet_wrap(~Month_Year)
For your 5 rows of sample data, that gives the graph below. You can play around with which variable goes to fill (fill color), x (x-axis), and facet_wrap (for small multiples). I see in your code you tried facet_grid(.~Type), but that won't work unless you have a column named Type.

I am using ggplot2 to make a bar chart and can't get the years correct along the x-axis

I am using ggplot2 to make a bar chart of the number of participants per year by gender. If I have 14 years included, I would like 2 bars for each year corresponding to the number of males and females for that year. I am not getting each year along the x-axis. I think data is being binned. I have tried changing the bin width, using scale_x_date and am still stuck.
Can you help me figure out how to have the data for EACH year in my graph?
As an example, here is my data for years 2004-2017:
year=c(2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017)
gender=c("male" , "female")
Participants is by gender, male then female respectively per year:
Participants=c(1307,443,1847,630,2109,765, 1824,691,2250,952,3123,1421,4097,1904,6415,3284,8788,4678,11581,6694,13141,8478,16389,10575,20990,13811,26951,19729)
data=data.frame(year,gender,Participants)
Here is how I am trying to generate my plot:
MyPlot <- ggplot(data, aes(fill=gender, y=Participants, x=year)) +
geom_bar(position="dodge", stat="identity",width = .8)
print(MyPlot + ggtitle("Annual Number of Participants by Gender"))
On the x-axis, the years 2006, 2010, 2014 and 2018 are marked and the bars correspond to data from two years. I want data for each year, both in terms of the bars and in terms of the ticks on the x-axis.
Any help would be appreciated!
You have more participants than years, so you don't have a clear dataframe design to serve as an input to ggplot.
Start by reading the tidy data vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
Its key points are:
Each variable forms a column.
Each observation forms a row.
Each type of observational unit forms a table.
Then once you have a tibble/data frame your ggplot2 code should work fine. I'd kill the width= option until you have it working.
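As a rough sketch of what a tidy version of the question's data could look like: each row holds one year-gender pair, so each year is repeated twice to line up with the male-then-female order of Participants. Converting year to a factor for the x-axis is my own suggestion, not something stated above; it is one way to get a tick and a pair of bars for every year.

```r
library(ggplot2)

year <- 2004:2017
gender <- c("male", "female")
Participants <- c(1307, 443, 1847, 630, 2109, 765, 1824, 691, 2250, 952,
                  3123, 1421, 4097, 1904, 6415, 3284, 8788, 4678, 11581,
                  6694, 13141, 8478, 16389, 10575, 20990, 13811, 26951, 19729)

# One row per year-gender pair: each year appears twice (male, then female),
# matching the order of the Participants vector.
data <- data.frame(
  year = rep(year, each = length(gender)),
  gender = rep(gender, times = length(year)),
  Participants = Participants
)

# Treating year as a factor gives every year its own tick and pair of bars.
MyPlot <- ggplot(data, aes(x = factor(year), y = Participants, fill = gender)) +
  geom_col(position = "dodge") +
  labs(x = "year", title = "Annual Number of Participants by Gender")
```

In the original `data.frame(year, gender, Participants)` call, the 14 years are recycled as a block (2004 to 2017, then 2004 to 2017 again), so the year column no longer lines up with the male/female pairs.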

geom_density(aes(y=..count..)) plot for multiple groups show a wrong x-axis count

My data frame (df) consists of 5 columns with 2,000 numerical values for each one.
Using reshape I reformatted my data frame to two columns: 1st containing the values (df$Values) (a total of 10,000) and a 2nd containing the name of the column (df$Labels) from where the value in col 1 is coming from.
I will use the 2nd column as a group factor.
I generated a mycolor and myshapes for coloring and setting the shape of lines.
With ggplot I tried to generate a density plot containing the density plot for the five factors.
The problem is that the x-axis shows the counts, whose maximum is 10,000. This value does not make sense, because the maximum possible count for each group should be 2,000. Does anyone know what is going on? What code do I need to use to correct the x-axis?
ggplot2, geom_density() plot:
Here is the code:
ggplot(df, aes(x=Values, colour=Labels, linetype=Labels))+
geom_density(aes(y=..count..))+
theme_classic()+
scale_colour_manual(values = mycolor)+
scale_linetype_manual(values = myshapes)+
ggtitle("Title")+
scale_x_continuous(limits = c(0.5,1.5))
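A possible explanation, offered as a sketch rather than a confirmed diagnosis: the ggplot2 documentation describes the count variable of a density layer as density multiplied by the number of points, so it is a rescaled curve height rather than a tally of observations. When the values are tightly clustered, the density peaks well above 1 and the rescaled curve can peak well above the group size. The base R example below uses simulated data (the sample size, mean and standard deviation are assumptions) to show the same rescaling outside ggplot2.

```r
set.seed(1)

# Hypothetical data: 2,000 values tightly clustered around 1.
n <- 2000
values <- rnorm(n, mean = 1, sd = 0.05)

# Kernel density estimate, then the same rescaling that the count
# variable of a density layer is documented to use: density * n.
d <- density(values)
count_curve <- d$y * n

# Peak height of the rescaled curve; it exceeds n when the density
# peaks above 1.
peak <- max(count_curve)

# Area under the rescaled curve, which is approximately n.
area <- sum(count_curve) * diff(d$x[1:2])
```

Under this reading, it is the area under each group's curve that comes to roughly 2,000, not its peak height.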
