Set a column as root of data frame - r

Hi, I have data in the form of a binary matrix, and I have generated a cluster dendrogram with the hclust function in R. First I normalise the values and then I plot. This is the code:
mat.norm <- t(df / sqrt(2*rowSums(df)))
plot(hclust(dist(mat.norm, "euclidean")))
My data consist of 9 columns, and the dendrogram is plotted for all 9 columns. Does anybody know if it is possible to set one of those columns as the root of the dendrogram, from which all the other columns will be clustered?
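For context, hclust builds its tree from the leaves upward, so its root is the final merge rather than any single column, and the function has no argument for choosing a root. The sketch below does not re-root the dendrogram. It only shows one alternative reading of the question: ranking the other columns by their distance to a chosen reference column, using the same normalisation as above. The toy data frame and the name `ref` are made up for illustration.

```r
# Toy binary data (hypothetical): 6 rows, 4 columns instead of the 9 in the question
df <- data.frame(
  c1 = c(1, 0, 1, 1, 0, 1),
  c2 = c(1, 0, 1, 0, 0, 1),
  c3 = c(0, 1, 0, 1, 1, 0),
  c4 = c(0, 1, 1, 1, 1, 0)
)

# Same normalisation as in the question; the transpose puts columns in rows
mat.norm <- t(df / sqrt(2 * rowSums(df)))

# Full distance matrix between the original columns
d <- as.matrix(dist(mat.norm, method = "euclidean"))

# Hypothetical reference column to compare everything else against
ref <- "c1"
dist_to_ref <- sort(d[ref, colnames(d) != ref])
print(dist_to_ref)
```

The sorted vector lists the remaining columns from most to least similar to the reference, which can be read alongside the usual dendrogram.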

Related

Make a tile graph in R but with a color key for each column

I have a data frame composed of 4 categorical variables and 100 observations, and I would like to plot them as column tiles with colors corresponding to the levels of each variable. How could I do that?
Here is an example data frame:
df3=data.frame(
var1=rep(c(1,2),length(LETTERS)),
var2=rep(c(1,2,3),length(LETTERS)),
var3=rep(c(1,2,3,4,5,6),length(LETTERS)))
In theory, I think I should assign each value of vars 1-3 a row (letter) and a column coordinate, and then use ggplot2 to plot the tile graph like this:
library(ggplot2)
# One row per cell of df3: its row index, column index and value
d = data.frame(row = factor(c(row(df3))),
               column = factor(c(col(df3))),
               value = c(as.matrix(df3)))
ggplot(d, aes(x = column, y = row, fill = value)) + geom_tile()
However, apart from not being sure whether I created these coordinates correctly, I also do not see how to give each of df3's original columns its own independent color key in the tile graph.
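One possible approach, sketched here under assumptions, is to fill by the combination of column and value, so that every level of every column gets its own legend entry within a single discrete scale. This is not a truly separate legend per column, and the shortened df3 and the name `key` are made up for illustration. The plot is only built if ggplot2 is installed.

```r
# Shortened version of df3 (hypothetical sizes, same idea as in the question)
df3 <- data.frame(
  var1 = rep(c(1, 2), 3),
  var2 = rep(c(1, 2, 3), 2),
  var3 = 1:6
)

m <- as.matrix(df3)
d <- data.frame(
  row    = factor(c(row(m))),
  column = factor(colnames(m)[c(col(m))], levels = colnames(m)),
  value  = c(m)
)

# One fill level per (column, value) pair, e.g. "var2: 3"
d$key <- interaction(d$column, d$value, sep = ": ", drop = TRUE, lex.order = TRUE)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(d, ggplot2::aes(x = column, y = row, fill = key)) +
    ggplot2::geom_tile()
  print(p)
}
```

With lex.order = TRUE the legend entries are grouped by column, which makes the single legend read like one key per variable.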

How to build a plot that evaluates the Xth percentile in R

I have a data frame called R5 with a single column of race times called RaceTimes.
I don't know which kind of ggplot to use or how to get the percentile on the x-axis.
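A minimal sketch of one way to do this, assuming the goal is a curve of race time against percentile: compute the quantiles with quantile() first, then plot the resulting table. The race times below are made up, and `pct` and `probs` are hypothetical names. The plot is only built if ggplot2 is installed.

```r
# Made-up race times standing in for the real R5 data
R5 <- data.frame(
  RaceTimes = c(31.2, 28.5, 35.0, 29.9, 40.1, 33.3, 27.8, 36.4, 30.6, 38.2)
)

# Percentiles from 0 to 100 in steps of 5
probs <- seq(0, 1, by = 0.05)
pct <- data.frame(
  percentile = probs * 100,
  time = as.numeric(quantile(R5$RaceTimes, probs = probs))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(pct, ggplot2::aes(x = percentile, y = time)) +
    ggplot2::geom_line() +
    ggplot2::labs(x = "Percentile", y = "Race time")
  print(p)
}
```

Reading the curve at x = X gives the Xth percentile of the race times.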

Line diagram using unique combinations of data frame

I have one response variable, which is continuous, and six output variables. Four of the six output variables are categorical.
I need to plot a line diagram for every unique combination of the categorical variables in the data frame, with respect to x and y.
Please help me.
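The question gives no data, so the following is only a sketch under assumptions: it uses two made-up categorical variables (`f1`, `f2`) instead of four, and combines them with interaction() so that each unique combination becomes one line. All column names and values are hypothetical. The plot is only built if ggplot2 is installed.

```r
# Hypothetical data: x, y and two categorical variables
dat <- data.frame(
  x  = rep(1:3, times = 4),
  f1 = rep(c("a", "b"), each = 6),
  f2 = rep(rep(c("u", "v"), each = 3), times = 2)
)
dat$y <- dat$x * as.integer(factor(dat$f1)) + as.integer(factor(dat$f2))

# One level per unique combination of the categorical variables
dat$combo <- interaction(dat$f1, dat$f2, sep = "/", drop = TRUE)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dat, ggplot2::aes(x = x, y = y, colour = combo, group = combo)) +
    ggplot2::geom_line()
  print(p)
}
```

More categorical variables can be passed to interaction() in the same way.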

How to create a data.frame with a matrix as a variable in R?

My data frame has a first column of factors, and all the other columns are numeric.
  Origin       spectrum_740.0  spectrum_741.0  spectrum_742.0  etc....
1 Warthog      0.6516295       0.6520196       0.6523843
2 Tiger        0.4184067       0.4183569       0.4183805
3 Sperm whale  0.9028763       0.9031688       0.9034069
I would like to convert the data frame into two variables, a vector (the first column) and a matrix (all the numeric columns), so that I can do calculations on the matrix, such as applying msc from the pls package. Basically, I want the data frame to be like the gasoline data set from pls, which has one variable as a numeric vector, and a second variable called NIR as a matrix with 401 columns.
Alternatively, if you have any suggestions for applying calculations to the numeric data while keeping the Origin column connected, that would work too, but all the examples I have seen use gasoline or similarly formatted data frames to do the calculations on the NIR matrix.
Thank you!
M = as.matrix(df[,-1])
row.names(M) = df[,1]
M
            spectrum_740.0 spectrum_741.0 spectrum_742.0
Warthog          0.6516295      0.6520196      0.6523843
Tiger            0.4184067      0.4183569      0.4183805
Sperm_whale      0.9028763      0.9031688      0.9034069
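To get closer to the gasoline layout the question describes (one ordinary column plus one matrix column), a matrix can be stored as a single data frame variable by wrapping it in I(). This is a minimal sketch using the three rows shown in the question; the names `spectra` and `NIR` are chosen here only to mirror gasoline.

```r
# The three rows shown in the question
df <- data.frame(
  Origin = c("Warthog", "Tiger", "Sperm whale"),
  spectrum_740.0 = c(0.6516295, 0.4184067, 0.9028763),
  spectrum_741.0 = c(0.6520196, 0.4183569, 0.9031688),
  spectrum_742.0 = c(0.6523843, 0.4183805, 0.9034069)
)

# I() keeps the matrix as one variable instead of splitting it into columns
spectra <- data.frame(
  Origin = df$Origin,
  NIR = I(as.matrix(df[, -1]))
)

# Calculations on the matrix stay aligned with Origin row by row
spectra$mean_intensity <- rowMeans(spectra$NIR)
str(spectra)
```

Because NIR is a single variable, functions that expect a matrix can be given spectra$NIR directly while Origin stays in the same row.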

calculate frequency, separate and transpose column that have two factor variable in R

This is my data: https://www.dropbox.com/s/msf0ro8saav7wbl/data1.txt?dl=0 (dataA). I want to extract "Habitat" into a frequency table so that I can calculate statistics such as mean and variance, and also draw plots such as a boxplot using ggplot2.
I tried the solution from the duplicate question "R: How to get common counts (frequency) of levels of two factor variables by ID Variable (as new data frame)", but I don't think it solves my problem.
Here's the easiest way to get a data.frame with frequencies using table. I'm using t to transpose and as.data.frame.matrix to transform it into a data.frame.
as.data.frame.matrix(t(table(data1)))
         A B C
Adult    1 2 1
Juvenile 2 0 0
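Since the linked file is not reproduced here, the sketch below uses a made-up two-column data1 that yields the same counts as the output above. The column names `group` and `stage` are hypothetical. It then shows how per-row summaries such as mean and variance can be taken from the resulting frequency data frame.

```r
# Hypothetical stand-in for data1: two factor-like columns
data1 <- data.frame(
  group = c("A", "A", "A", "B", "B", "C"),
  stage = c("Adult", "Juvenile", "Juvenile", "Adult", "Adult", "Adult")
)

# Cross-tabulate, transpose, and convert to a plain data frame
freq <- as.data.frame.matrix(t(table(data1)))
print(freq)

# Simple summaries across the A, B, C columns for each row
row_mean <- apply(freq, 1, mean)
row_var  <- apply(freq, 1, var)
```

The row names of freq carry the levels of the second column, so they remain available as labels for later plotting.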
