I want to create multi-layer plots using for-loops. The main dataframe I am working with has the following characteristics:
Color   55_ab_LL_bubbles_D1  55_ab_LL_troubles_D1  34_ac_LL_bubbles_D1  34_ac_LL_troubles_D1
Blue    453.3                766.1                 562.1                883.3
Green   775.5                897.1                 434.5                983.4
Purple  883.4                445.7                 787.2                555.5
Yellow  764.1                445.6                 887.3                673.5
(The columns are the products; the row names are the colors.)
In the code below, I run a loop down the row names (Color) to create one scatter plot per row.
In addition to looping over the row names, I would like to create an individual scatter plot for each product number (the leading part of the column name: "55_", "34_", etc.). That is, I want to group all data points that share the same leading number and draw a separate scatter plot for each number and each color. So instead of the four scatter plots I get right now (one per color), I would like eight (one per color and product number).
Any suggestions are appreciated :) !
CODE:
pdf("scatterplot.pdf")
for (i in seq_len(nrow(data))) {
  df <- data.frame(x = data[i, grep("bubbles_", colnames(data))],
                   y = data[i, grep("troubles_", colnames(data))])
  # xy holds the axis limits; it is defined elsewhere (not shown)
  plot(df$x, df$y,
       xlim = xy, ylim = xy)
}
dev.off()
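One possible way to get a plot per color and product number is to pull the leading number out of each column name and add an inner loop over those numbers. The sketch below is only an illustration under assumptions: it rebuilds the table from the question as a matrix, takes xy to be the range of all values, and uses hypothetical helper names (prefix, n_plots) that are not part of the original code.

```r
# Rebuild the example table from the question as a matrix (assumed layout)
data <- matrix(
  c(453.3, 766.1, 562.1, 883.3,
    775.5, 897.1, 434.5, 983.4,
    883.4, 445.7, 787.2, 555.5,
    764.1, 445.6, 887.3, 673.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Blue", "Green", "Purple", "Yellow"),
    c("55_ab_LL_bubbles_D1", "55_ab_LL_troubles_D1",
      "34_ac_LL_bubbles_D1", "34_ac_LL_troubles_D1")))

# Leading product number of each column: everything before the first "_"
prefix <- sub("_.*$", "", colnames(data))

# Assumption: shared axis limits spanning all values
xy <- range(data)

n_plots <- 0  # counter, only used to check how many plots were drawn
pdf(file.path(tempdir(), "scatterplot.pdf"))
for (i in seq_len(nrow(data))) {      # one pass per color
  for (p in unique(prefix)) {         # one pass per product number
    cols <- colnames(data)[prefix == p]
    x <- unlist(data[i, grep("bubbles_", cols, value = TRUE)])
    y <- unlist(data[i, grep("troubles_", cols, value = TRUE)])
    plot(x, y, xlim = xy, ylim = xy,
         main = paste(rownames(data)[i], p))
    n_plots <- n_plots + 1
  }
}
dev.off()
```

With the four colors and two product numbers shown in the question this draws eight pages. Each page holds a single point here because the example has only one bubbles/troubles pair per product number; more replicate columns per number would add more points.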
I'm trying to color the points in my plots based on the column "cluster".
#kmeans (1mln: 8-10clusters)
km.out<-kmeans(kpis_scaled, centers=10,nstart=50)
print(km.out)
# attaching the cluster number to initial datasets
main_kpis_num$cluster <- as.numeric(km.out$cluster)
main_kpis$cluster <- as.factor(km.out$cluster)
library(PerformanceAnalytics)
chart.Correlation(main_kpis_num, col = main_kpis_num$cluster, pch=21)
I have played with the col=, bg= and fg= parameters, but in every case all my dots come out pure black in the pairs plot, with no split by cluster color.
My resulting graph is this:
https://disk.yandex.ru/i/QPqqF6jGGsCyXg
while my final target is to have the dots colored by segment size, like in this pretty example:
https://i0.wp.com/i.imgur.com/CD5gO.png?zoom=1.5&w=578
My sample is:
https://disk.yandex.ru/d/qlizyBH8-Y-MTA
Can you please help me in finding the mistake?
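As a point of comparison rather than a diagnosis of chart.Correlation, base R's pairs() does apply a per-point col vector. The sketch below runs on stand-in data: the built-in iris measurements and 3 clusters are assumptions made for illustration and do not come from the question, and cluster_cols is a hypothetical helper name.

```r
set.seed(42)

# Stand-in for the scaled KPI table (assumption: built-in iris data)
kpis_scaled <- scale(iris[, 1:4])

# Cluster as in the question, but with 3 centers for this small example
km.out <- kmeans(kpis_scaled, centers = 3, nstart = 50)

# Map each cluster number to a color, one entry per row
cluster_cols <- c("red", "darkgreen", "blue")[km.out$cluster]

# Scatter plot matrix with points colored by cluster
pdf(file.path(tempdir(), "pairs_by_cluster.pdf"))
pairs(kpis_scaled, col = cluster_cols, pch = 21)
dev.off()
```

If the colors show up here but not in chart.Correlation, that would suggest the issue lies in how the col argument reaches its panels rather than in the cluster column itself.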
I have a problem that might be a bug in heatmaply or plotly: the colors in the side bar of a heatmap are not the colors I specified. See the code example below. At the end of the code, in part 6), the first plot, drawn with the plot function (simple plot showing the colors), shows the colors correctly (yellow and blue):
The second plot, which uses these colors in a heatmaply side bar (heatmaply side bar with wrong color):
fails to show them correctly and instead shows what appear to be random colors. In a similar plot with real data there are even red and orange colors in the side bar (heatmaply sidebar shows red and orange while color range is blue-yellow):
even though all color codes are generated from a blue-yellow color range. Any ideas what might cause this and how to show colors in the side bar that are consistent with their color codes?
# Compare cophenetic similarity between leaves in two trees built on the full data and on a subsample of the data
library(heatmaply)
# 1) Generate random data to build trees
set.seed(2015-04-26)
dat <- matrix(rnorm(100), 10, 50) # Matrix with 10 rows and 50 columns
datSubSample <- dat[, sample(ncol(dat), 30)] # Matrix with 30 columns sampled from the 50 columns of dat
dat_dist1 <- dist(datSubSample)
dat_dist2 <- dist(dat)
hc1 <- hclust(dat_dist1)
hc2 <- hclust(dat_dist2)
# 2) Build two dendrograms: dendrogram1 from the subsample (30 out of 50 columns), dendrogram2 from all the data
dendrogram1 <- as.dendrogram(hc1)
dendrogram2 <- as.dendrogram(hc2)
# 3) For each leaf in a tree, get the cophenetic distance matrix;
# each column holds the distances from that leaf to all others in the same tree
cophDistanceMatrix1 <- as.data.frame(as.matrix(cophenetic(dendrogram1)))
cophDistanceMatrix2 <- as.data.frame(as.matrix(cophenetic(dendrogram2)))
# 4) For each leaf, calculate the correlation between its cophenetic distances to all other leaves in the two trees
corPerLeave <- NULL # Vector to store the correlation for each leaf across the two trees
for (leave in colnames(cophDistanceMatrix1)){
cor <- cor(cophDistanceMatrix2[leave], cophDistanceMatrix1[leave])
corPerLeave <- c(corPerLeave, unname(cor))
}
# 5) Convert the cophenetic correlations to colors to show in the side bar of a heatmap
corPerLeave <- corPerLeave / max(corPerLeave) # Rescale so the largest correlation is 1
byPal <- colorRampPalette(c('yellow', 'blue')) # Yellow-blue color palette, low correlation = yellow
colCopheneticCor <- byPal(20)[as.numeric(cut(corPerLeave, breaks =20))]
# 6) Plot a heatmap with a dendrogram and a side bar that shows the cophenetic correlation for each leaf
row_dend <- dendrogram2
x <- as.matrix(dat_dist2)
#### The plots below use the same color codes; the normal plot works, but heatmaply shows the wrong colors
plot(x = 1:length(colCopheneticCor), y = 1:length(colCopheneticCor), col = colCopheneticCor)
heatmaply(x, colD = row_dend, row_side_colors = colCopheneticCor)
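To check which color each value receives in step 5, the mapping can be run on a few hand-picked numbers. In the sketch below the vals vector is made up for illustration and is not taken from the question. Note that cut() with breaks = 20 splits the observed range of its input into 20 equal-width bins, so the smallest value always lands in the first bin and the largest in the last, whatever their absolute size.

```r
# Same palette function as in step 5
byPal <- colorRampPalette(c("yellow", "blue"))

# Hypothetical correlations, chosen only to illustrate the mapping
vals <- c(0.2, 0.55, 1.0)

# Bin index (1..20) of each value, relative to the range of vals
bins <- as.numeric(cut(vals, breaks = 20))

# Look up the hex color code for each bin
cols <- byPal(20)[bins]

print(data.frame(vals, bins, cols))
```

The smallest value maps to pure yellow ("#FFFF00") and the largest to pure blue ("#0000FF"). The result is a plain character vector of hex codes, which is the kind of vector passed to row_side_colors in the question.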
Found the solution: you can pass a color function to heatmaply's built-in row_side_palette parameter. A minimal example, which can be combined with the code in the question to show a heatmap whose side bar represents the cophenetic correlation per leaf/species by color:
byPal <- colorRampPalette(c('red','blue')) # Two-color palette function to be used in the side bar
heatmaply(m, colD = row_dend, file = fileName1, plot_method = "plotly", colorscale = 'Viridis', row_side_palette = byPal,
row_side_colors = data.frame("Correlation cophenetic distances" = corPerLeave, check.names = FALSE))
One problem I have not solved yet is how to show a continuous colorbar in the legend. Any suggestions?
I am using Julia for financial data processing and for plotting graphs based on that data.
On the x-axis of the graph I plot dates (daily prices).
On the y-axis I plot stock prices, MovingAverage13 and MovingAverage21.
I am currently using DataFrames to plot the data.
Code:
df=DataFrame(x=dates,y1=pricesClose,y2=m13,y3=m21)
l1=layer(x="x",y="y1",Geom.line,Theme(default_color=color("blue")));
l2=layer(x="x",y="y2",Geom.line,Theme(default_color=color("red")));
l3=layer(x="x",y="y3",Geom.line,Theme(default_color=color("green")));
p=plot(df,l1,l2,l3);
draw(PNG("stock.png",6inch,3inch),p)
I am getting the graphs correctly, but I am not able to add a legend to the graph that shows:
blue line is for Close Prices
red line is for moving average 13
green line is for moving average 21
How can we add a legend to the graph?
I understand from the comments in this link that currently it is not possible to get a legend for a list of layers.
Gadfly is based on Hadley Wickham's ggplot2 for R, so the usual pattern is to arrange the data into a DataFrame with a discrete column for labelling purposes. In your case, this approach would look like:
using Gadfly, DataFrames
x = 1:10
df1 = DataFrame(x=x, y=2x, label="double")
df2 = DataFrame(x=x, y=x.^2, label="square")
df3 = DataFrame(x=x, y=1 ./ x, label="inverse")  # space before ./ so it is not read as the literal "1."
df = vcat(df1, df2, df3)
p = plot(df, x="x", y="y", color="label", Geom.line,
         Scale.discrete_color_manual("blue", "red", "green"))
draw(PNG("stock.png", 6inch, 3inch), p)
You can now use Guide.manual_color_key.
The only change needed in your code is here:
p=plot(df,l1,l2,l3,
Guide.ylabel("Some text"),
Guide.title("My title"),
Guide.manual_color_key("Legend", ["I'm blue l1", "I'm red l2", "I'm green l3"], ["blue", "red", "green"]))
I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like this?
Let's say I have an array x1 = c(2,13,4) and y2 = c(5,23,43). I want to create 3 blocks with heights running from 2 to 5, 13 to 23...
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at?
I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the ranges in the data you showed us overlap, which means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is typically used to plot the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I chart an empty plot first and then add three layers of rectangles: one for beige, one for gray and one for red.
#Create data.frame
n=100
df <-data.frame(1:n,runif(n)*10,60+runif(n)*10,25+runif(n)*10,40+runif(n)*10,35-runif(n)*10,35+runif(n)*10)
colnames(df) <-c("id","beige.min","beige.max","gray.min","gray.max","red.min","red.max")
#Create chart
plot(x=df$id,y=NULL,ylim=range(df[,-1]), type="n") #blank chart, ylim is the range of the data
rect(df$id-0.5,df[,2],df$id+0.5,df[,3],col="beige", border=FALSE) #first layer
rect(df$id-0.5,df[,4],df$id+0.5,df[,5],col="gray", border=FALSE) #second layer
rect(df$id-0.5,df[,6],df$id+0.5,df[,7],col="darkred", border=FALSE) #third layer
It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))
I have a matrix of 12 columns, and I am using the boxplot function in R to plot the boxplots.
The following commands are used:
pdf("data.pdf")
data<-read.table("data1", header=T)
boxplot(data, outline=F)
dev.off()
What I want is to show the first three boxplots in red, green and blue, the next three in yellow, the next three in orange and the last three in purple.
How can I do this?
Thank you
To get colours, you just need to pass a vector of colours to the boxplot function:
##Create some dummy data
dd = matrix(runif(10*12), ncol=12)
##Create a vector of 12 colours
cols = rep(c("yellow", "orange", "purple"), each=3)
cols = c("red", "green", "blue", cols)
##Plot as normal
boxplot(dd, col=cols)
BTW, don't load your data at every iteration of your for loop. Load it once:
data <- read.table("data1", header=T)
pdf("data.pdf")
boxplot(data, outline=F)
dev.off()