Visualize two step cluster using ggplot2 [duplicate] - r

This question already has an answer here:
Reversed order of bars in grouped barplot with coord_flip
(1 answer)
Closed 2 years ago.
SPSS offers a kind of clustering called two-step cluster.
The visual output SPSS provides looks something like the left-side plot.
I have the cluster results, the labels/names of the variables used, and their scores in a data frame like this:
df <- data.frame(cluster = c(1,1,1,2,2,2,3,3,3), value = c("Google","Amazon","Yahoo","Google","Amazon","Yahoo","Google","Amazon","Yahoo"), score = c(2194.2,43.2,4331.3,31.3,133.1,432.1,3234.1,44.3,21.4))
These are the inputs referred to in the SPSS plot.
Is there an efficient way to visualize them using ggplot2?

Maybe something like this:
library(ggplot2)
# Plot: stacked scores per cluster, drawn as horizontal bars
ggplot(df, aes(x = cluster, y = score, fill = value)) +
  geom_bar(stat = 'identity', position = 'stack') +
  coord_flip()
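A possible variation, offered only as a sketch: the SPSS plot itself is not reproduced here, so the layout below (one panel per cluster, one horizontal bar per variable) is an assumption about what might come close, not a reconstruction of it. It uses the data from the question and only standard ggplot2 functions (geom_col, facet_wrap, label_both).

```r
library(ggplot2)

# Data from the question, assigned to df so the plot call can find it
df <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  value   = c("Google", "Amazon", "Yahoo", "Google", "Amazon", "Yahoo",
              "Google", "Amazon", "Yahoo"),
  score   = c(2194.2, 43.2, 4331.3, 31.3, 133.1, 432.1, 3234.1, 44.3, 21.4)
)

# One panel per cluster, one horizontal bar per variable
p <- ggplot(df, aes(x = value, y = score, fill = value)) +
  geom_col() +
  coord_flip() +
  facet_wrap(~ cluster, nrow = 1, labeller = label_both) +
  theme(legend.position = "none")
print(p)
```

Facets keep each cluster's bars on a shared score axis, which makes clusters easier to compare than a single stacked bar per cluster.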

Related

Issues creating a line chart in R. Group aesthetic error [duplicate]

This question already has an answer here:
ggplot each group consists of only one observation
(1 answer)
Closed 8 months ago.
Below is the sample data. I'm trying to draw two lines with different colors. It seems pretty simple, but I'm running into the error below. I have two questions. First, how do I get around this error? Second, how would I edit the legend so that it says "Hires" instead of "HI"?
"geom_path: Each group consists of only one observation. Do you need to
adjust the group aesthetic?"
library(ggplot2)
library(dplyr)  # provides the %>% pipe used below
measure <- c("HI","HI","HI","HI","HI","JO","JO","JO","JO","JO")
date <- c("2002-01","2002-02","2002-03","2002-04","2002-05","2002-01","2002-02","2002-03","2002-04","2002-05")
value <- c(100,105,95,145,110,25,35,82,75,90)
df <- data.frame(measure, date, value)
graph <- df %>%
  ggplot(aes(x = date, y = value, color = measure)) +
  geom_line() +
  theme(legend.position = "bottom", legend.title = element_blank())
print(graph)
It's asking for a group, so you can give it one by adding group = measure to the aes() call:
ggplot(aes(x = date, y = value, group = measure, color = measure))
It's a bit surprising that it's not already grouped, and I'm not exactly sure why, but the above change appears to produce the result you want.
If you're interested in why it's asking for a group, I'd recommend simplifying and reformatting your example, and then asking as a separate question.
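As far as I understand the ggplot2 documentation, the default group is the interaction of all discrete variables in the plot. Here date is a character vector, so every date/measure pair ends up in its own group with a single point, which is what geom_line complains about. The sketch below takes a different route from the fix above: it converts date to a real Date (assuming each "YYYY-MM" can be treated as the first day of that month) and relabels the legend with a named vector. The question does not say what "JO" stands for, so it is left unchanged.

```r
library(ggplot2)

df <- data.frame(
  measure = rep(c("HI", "JO"), each = 5),
  date    = rep(c("2002-01", "2002-02", "2002-03", "2002-04", "2002-05"), 2),
  value   = c(100, 105, 95, 145, 110, 25, 35, 82, 75, 90)
)

# Assumption: treat "YYYY-MM" as the first day of the month so x is continuous
df$date <- as.Date(paste0(df$date, "-01"))

graph <- ggplot(df, aes(x = date, y = value, color = measure)) +
  geom_line() +
  # Named vector relabels the legend keys; "JO" is kept as it is
  scale_color_discrete(labels = c(HI = "Hires", JO = "JO")) +
  theme(legend.position = "bottom", legend.title = element_blank())
print(graph)
```

With a continuous x axis the only discrete variable left is measure, so the two lines are grouped without an explicit group aesthetic.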

How to make a single histogram from 3 columns in R? [duplicate]

This question already has answers here:
Plotting two variables as lines using ggplot2 on the same graph
(5 answers)
Closed 5 years ago.
This is my first time using ggplot2. I have a table with 3 columns and I want to plot the frequency distribution of all three columns in one figure. I have only used hist() before, so I am a little lost with ggplot2. Here is an example of my table. It is a tab-separated table with 3 columns and the headers A, B, C.
A B C
1.38502 1.38502 -nan
0.637291 0.753084 1.55556
0.0155242 0.0164394 -nan
3.29355 1.15757 -nan
1.00254 1.10108 0.132039
0.0155424 0.0155424 nan
0.760261 0.681639 0.298851
1.21365 1.21365 -nan
1.216 1.22541 -nan
0.61317 0.738528 0.585657
0.618276 0.940312 0.820591
1.96779 1.31051 1.58609
0.725413 2.29621 1.78989
0.684681 0.67331 0.290221
I used the following code after looking up similar posts, but I end up with an error.
library(ggplot2)
dnds <- read.table('dNdS_plotfile', header =TRUE)
ggplot(data=dnds, melt(dnds), aes_(value, fill = L1))+
geom_histogram()
ERROR:No id variables; using all as measure variables
Error: Mapping should be created with aes() or aes_().
I am really lost on how to solve this error. I want one figure with three different colored histograms that do not overlap in my final figure. Please help me achieve this. Thank you.
This should accomplish what you're looking for. I like to load the tidyverse package, which loads a bunch of helpful packages like ggplot2 and dplyr.
In geom_histogram(), you can set the width of the bins with the binwidth argument or the number of bins with the bins argument. If you also want the bars side by side rather than stacked, you can use the argument position = "dodge".
See the documentation here: http://ggplot2.tidyverse.org/reference/geom_histogram.html
library(tidyverse)
data <- read.table("YOUR_DATA", header = TRUE)
# Reshape from wide (A, B, C) to long (category, value)
graph <- data %>%
  gather(category, value)
ggplot(graph, aes(x = value, fill = category)) +
  geom_histogram(binwidth = 0.5, color = "black")
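For reference, the same reshaping can be sketched with tidyr's pivot_longer, which the tidyr documentation recommends over gather for new code. The data frame below is just the first five rows of the table from the question typed in by hand, with the "-nan" entries written as NaN; it is a stand-in, not the full file. The sketch also uses position = "dodge" so the three histograms sit side by side instead of stacking.

```r
library(ggplot2)
library(tidyr)

# Stand-in data: first five rows of the question's table, "-nan" typed as NaN
dnds <- data.frame(
  A = c(1.38502, 0.637291, 0.0155242, 3.29355, 1.00254),
  B = c(1.38502, 0.753084, 0.0164394, 1.15757, 1.10108),
  C = c(NaN, 1.55556, NaN, NaN, 0.132039)
)

# Wide (A, B, C) to long (category, value)
long <- pivot_longer(dnds, cols = everything(),
                     names_to = "category", values_to = "value")
long <- long[!is.nan(long$value), ]  # drop the missing entries

p <- ggplot(long, aes(x = value, fill = category)) +
  geom_histogram(binwidth = 0.5, color = "black", position = "dodge")
print(p)
```

Dropping the NaN rows up front avoids the "removed rows containing non-finite values" warning that ggplot2 would otherwise print.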

Sorting DataFrame for ggplot barplot [duplicate]

This question already has answers here:
Order Bars in ggplot2 bar graph
(16 answers)
Closed 6 years ago.
I have a data frame df1 and want to draw a barplot of AccountExecutive and the corresponding ClearRate, with the bars arranged in decreasing order from left to right.
I tried this code, but the resulting graph still shows AccountExecutive in the order in which it appears in df1.
ggplot(arrange(df1, -ClearRate), aes(x = AccountExecutive, y = ClearRate)) +
  geom_bar(stat = "identity")
Can anyone help me correct this code?
NOTE: Not a duplicate of the previous question because that one asks for an arbitrary positioning of the x axis labels. This question asks how to sort x-axis labels considering their y-axis values.
Try this. The code below should reorder AccountExecutive according to ClearRate:
ggplot(df1, aes(x = reorder(AccountExecutive, -ClearRate), y = ClearRate)) + geom_bar(stat = "identity")
Here is more about the reorder function:
Reorder bars in geom_bar ggplot2
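The reason arrange() has no visible effect is that ggplot2 orders a discrete axis by factor levels (alphabetically for character columns), not by row order. A small sketch of the reorder approach follows; the question does not show df1, so the three rows below are made up purely for illustration.

```r
library(ggplot2)

# Hypothetical data; the real df1 is not shown in the question
df1 <- data.frame(
  AccountExecutive = c("Ann", "Bob", "Cid"),
  ClearRate        = c(0.62, 0.91, 0.75)
)

# reorder() sets the factor levels by ascending -ClearRate,
# i.e. by descending ClearRate
ordered_ae <- reorder(df1$AccountExecutive, -df1$ClearRate)
levels(ordered_ae)

p <- ggplot(df1, aes(x = reorder(AccountExecutive, -ClearRate), y = ClearRate)) +
  geom_bar(stat = "identity") +
  xlab("AccountExecutive")  # replace the default "reorder(...)" axis title
print(p)
```

The xlab() call is optional; without it the axis title shows the whole reorder(...) expression.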

same bar width in ggplot2? [duplicate]

This question already has answers here:
A way to always dodge a histogram? [duplicate]
(2 answers)
Closed 8 years ago.
In this example:
library(ggplot2)
dat <- data.frame(a=factor(c(1,1,1,2,2,2,3,3,3,4)), b=c("A","B","D","A","B","C","A","B","D",NA), c=c(1,4,3,5,5,1,2,2,8,6))
plot <- ggplot(dat,aes(fill=b,x=a,y=c))
plot + geom_bar(width=.7, position=position_dodge(width=.7), stat = "identity")
The bar for factor level 4 is wider than the other bars. Is there a way to make them all the same width?
Ideally you should have data for every combination, even if the value is zero. That means that for level 1 of dat$a you should have a row for each of the four categories (A, B, C, D), and so on. Try modifying your data frame like this and then plot it. The NA category is labelled "other" here.
library(ggplot2)
dat <- data.frame(a = factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)),
                  b = c("A","B","C","D","other","A","B","C","D","other","A","B","C","D","other","A","B","C","D","other"),
                  c = c(1,4,0,3,0,5,5,1,0,0,2,2,0,8,0,0,0,0,0,6))
plot <- ggplot(dat, aes(fill = b, x = a, y = c))
plot + geom_bar(width = .7, position = position_dodge(width = .7), stat = "identity")
View this data frame and you will see the difference. The plot will have gaps where the zero-valued combinations are, which doesn't look good, but I'm afraid this might be the only solution.
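Rather than typing the padded data frame by hand, the missing combinations can probably be filled in programmatically. The sketch below uses tidyr::complete, which is my substitution and not part of the answer above; it starts from the question's original dat, recodes NA to "other" as the answer does, and fills the missing a/b pairs with zero.

```r
library(ggplot2)
library(tidyr)

# Original data from the question
dat <- data.frame(
  a = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4)),
  b = c("A", "B", "D", "A", "B", "C", "A", "B", "D", NA),
  c = c(1, 4, 3, 5, 5, 1, 2, 2, 8, 6),
  stringsAsFactors = FALSE
)
dat$b[is.na(dat$b)] <- "other"  # same relabelling as in the answer

# Add a row with c = 0 for every a/b combination that is missing
full <- complete(dat, a, b, fill = list(c = 0))

p <- ggplot(full, aes(fill = b, x = a, y = c)) +
  geom_bar(width = .7, position = position_dodge(width = .7), stat = "identity")
print(p)
```

This yields the same 20-row layout as the hand-written version, so every group reserves five equally wide slots.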

Facet for continuous variables in ggplot2 [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
ggplot - facet by function output
ggplot2's facets option is great for showing multiple plots by factors, but I've had trouble learning to efficiently convert continuous variables to factors within it. With data like:
DF <- data.frame(WindDir = sample(0:180, 20, replace = TRUE),
                 WindSpeed = sample(1:40, 20, replace = TRUE),
                 Force = sample(1:40, 20, replace = TRUE))
qplot(WindSpeed, Force, data=DF, facets=~cut(WindDir, seq(0,180,30)))
I get the error: At least one layer must contain all variables used for facetting
I would like to examine the relationship Force~WindSpeed by discrete 30-degree intervals, but it seems facet requires factors to be attached to the data frame being used (obviously I could do DF$DiscreteWindDir <- cut(...), but that seems unnecessary). Is there a way to use facets while converting continuous variables to factors?
Here is an example of how you can use transform to make an inline transformation:
qplot(WindSpeed, Force,
      data = transform(DF,
                       fct = cut(WindDir, seq(0, 180, 30))),
      facets = ~fct)
You don't "pollute" data with the faceting variable, but it is in the data frame for ggplot to facet on (rather than being a function of columns in the facet specification).
This works just as well in the expanded syntax:
ggplot(transform(DF,
                 fct = cut(WindDir, seq(0, 180, 30))),
       aes(WindSpeed, Force)) +
  geom_point() +
  facet_wrap(~fct)
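One detail worth checking: by default cut() builds intervals that are open on the left, so a WindDir of exactly 0 falls outside (0,30] and becomes NA, which ggplot2 then shows as its own NA facet. A sketch with include.lowest = TRUE follows; the set.seed call and the forced 0 in the first row are only there to make the edge case reproducible.

```r
library(ggplot2)

set.seed(1)  # only to make the example reproducible
DF <- data.frame(WindDir   = sample(0:180, 20, replace = TRUE),
                 WindSpeed = sample(1:40, 20, replace = TRUE),
                 Force     = sample(1:40, 20, replace = TRUE))
DF$WindDir[1] <- 0  # force the edge case described above

# include.lowest = TRUE closes the first interval: [0,30], (30,60], ...
binned <- transform(DF,
                    fct = cut(WindDir, seq(0, 180, 30), include.lowest = TRUE))

p <- ggplot(binned, aes(WindSpeed, Force)) +
  geom_point() +
  facet_wrap(~fct)
print(p)
```

DF itself stays free of the helper column, since transform() returns a modified copy.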
