Why are the labels not arranged properly in `stars()` in R?

I am using the following code to draw a star plot with stars(), one of the visualization techniques for multivariate data.
library(randomNames)
set.seed(3)
Name = randomNames(50, which.names = 'first')
height = sample(160:180, 50, replace = TRUE)
weight = sample(45:85, 50, replace = TRUE)
tumour_size = runif(50, 0,1)
df = data.frame(Name, height, weight, tumour_size, rnorm(50, 10,3))
stars(df,labels = Name)
But I get output like this:
How can I align the names exactly below the stars?

Use option flip.labels=FALSE.
stars(df, labels = Name, flip.labels = FALSE)
Result
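For readers without the randomNames package, here is a minimal sketch of the same idea on the built-in mtcars data. The choice of rows, columns, and the cex value are illustrative assumptions, not part of the original answer. As far as I understand, when flip.labels is left unset, stars() decides on its own whether to stagger long labels up and down; setting it to FALSE keeps every label directly below its star.

```r
# Built-in data, so no extra package is needed: first 12 cars, 7 variables.
cars <- mtcars[1:12, 1:7]

# flip.labels = FALSE stops stars() from alternating the label height
# from one star to the next, so each name sits directly under its star.
stars(cars,
      labels = rownames(cars),
      flip.labels = FALSE,
      cex = 0.7,                      # smaller text so long names fit
      main = "mtcars (first 12 rows)")
```

If the names still overlap, reducing cex or the number of stars per row (the ncol argument) is worth trying.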

Related

Automatic creation of barplots for all columns

For a very large table, I would like to automatically create a barplot for each column and save each one as a PNG.
The title can simply be the column name, and the bar labels can be the values found in that column, so no individual editing of the barplots is necessary.
I have already created barplots by hand and experimented with lapply, without success.
Here is the barplot code for a single column:
png(file="a_x.png", width=600, height=400)
barplot(table(Example$a_x), main = "a_x")
dev.off()
You can do a simple for loop and paste0 for the file name:
Data
df <- data.frame(a_x = sample(c("Yes","No"), 100, prob = c(0.10,0.90), replace = TRUE),
b_x = sample(c("Yes","No"), 100, prob = c(0.10,0.90), replace = TRUE),
c_x = sample(c("Yes","No"), 100, prob = c(0.10,0.90), replace = TRUE))
Code
for(i in colnames(df)){
png(file = paste0(i,".png"), width = 600, height = 400)
barplot(table(df[i]), main = i)
dev.off()
}
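Since the question mentions lapply, the same loop can be sketched in apply style. This is only one possible variant: the helper name save_barplot, the example table, and the use of tempdir() as the output folder are assumptions made for illustration.

```r
# Made-up example table; the column names are placeholders.
set.seed(1)
example <- data.frame(
  a_x = sample(c("Yes", "No"), 100, replace = TRUE),
  b_x = sample(c("Yes", "No"), 100, replace = TRUE)
)

out_dir <- tempdir()  # assumption: write the files to a temporary folder

# Hypothetical helper: draws one column's barplot into "<column>.png".
save_barplot <- function(col) {
  path <- file.path(out_dir, paste0(col, ".png"))
  png(filename = path, width = 600, height = 400)
  on.exit(dev.off())  # close the device even if barplot() fails
  barplot(table(example[[col]]), main = col)
  path
}

# vapply() is used instead of lapply() so the result is a plain
# character vector of the file paths that were written.
files <- vapply(names(example), save_barplot, character(1))
```

Using on.exit(dev.off()) means a failure in one column does not leave a graphics device open for the next one.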

Adding title to simple_visNet

I am using simple_visNet to represent data but I am not able to add a title to my graph. Any idea?
simu_data_mapper <- mapper.sta(dat = data,
filter_values = filt,
num_intervals = 10,
percent_overlap = 70)
simple_visNet(simu_data_mapper, color_filter = FALSE)

making 2 box plots from the same data frame in R

I want to make two box plots, with weight on the y axis and the before/after condition on the x axis, so that the two boxplots are displayed side by side.
rats_before = data.frame(
rat_num = paste0(rep("rat number",200),1:200),
weight = rweibull(200,shape= 10,scale = 20))
rats_after = data.frame(
rat_num = paste0(rep("rat number",200),1:200),
weight = rweibull(200,shape= 9,scale = 21))
rats = merge(rats_before,rats_after, by = c("rat_num"))
I know the next part is not even close, but it will give you an idea of what I'm trying to do.
rat_boxplot = qplot(y = weight, x = (rats_after, rats_before), geom = "boxplot", data = rats)
Or, if you want to do this in base R -
rats_before = data.frame(
rat_num = paste0(rep("rat number",200),1:200),
weight = rweibull(200,shape= 10,scale = 20))
rats_after = data.frame(
rat_num = paste0(rep("rat number",200),1:200),
weight = rweibull(200,shape= 9,scale = 21))
rats <- rbind(rats_before, rats_after)
rats$type <- c(rep("before", nrow(rats_before)), rep("after", nrow(rats_after)))
rats$type <- factor(rats$type)
rats$type <- relevel(rats$type, ref = 2)
boxplot(weight ~ type, data = rats)
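If the merged table from the question (one row per rat) is preferred, a base R sketch along the following lines may also work. The suffixes `_before` and `_after` and the seed are illustrative choices, not from the original post.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

rats_before <- data.frame(
  rat_num = paste0("rat number", 1:200),
  weight  = rweibull(200, shape = 10, scale = 20)
)
rats_after <- data.frame(
  rat_num = paste0("rat number", 1:200),
  weight  = rweibull(200, shape = 9, scale = 21)
)

# merge() keeps one row per rat; suffixes name the two weight columns.
rats <- merge(rats_before, rats_after, by = "rat_num",
              suffixes = c("_before", "_after"))

# Passing a data frame to boxplot() draws one box per column.
boxplot(rats[, c("weight_before", "weight_after")],
        names = c("before", "after"),
        ylab = "weight")
```

The wide layout keeps each rat's two measurements on the same row, which is convenient if paired differences are needed later.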
You can add a column to each data frame and use rbind, which binds the rows of the two data frames, instead of merge. Then you simply map that column in the aes of a ggplot.
library(ggplot2)
# label each data frame with its condition before stacking them
rats_before$condition = "before"
rats_after$condition = "after"
rats = rbind(rats_before,rats_after)
ggplot(rats)+geom_boxplot(aes(condition,weight))
Hope I understood your question.
Tom

How to plot multiple lines in radar chart using split in plotly

I have tried using a split trace with scatterpolar, and it seems to partly work, but I can't get it to plot the values for all 10 variables. I want each row (identified by "ean") to be plotted as its own line using the values from X1 to X10.
library(tidyverse)
library(vroom)
library(plotly)
types <- rep(times = 10, list(
col_integer(f = stats::runif,
min = 1,
max = 5)))
products = bind_cols(
tibble(ean = sample.int(1e9, 25)),
tibble(kategori = sample(c("kat1", "kat2", "kat3"), 25, replace = TRUE)),
gen_tbl(25, 10, col_types = types)
)
plot_ly(
products,
type = 'scatterpolar',
mode = "lines+markers",
r = ~X1,
theta = ~"X1",
split = ~ean
)
How can I get plotly to plot all variables (X1-X10) in the radar chart? Usually I would select the columns with X1:X10, but I can't do that here (I think this is because ~ is used to select the variable).
I want the result to look something like this, except that I would show only lines rather than filled polygons and would have more products. I know 25 products is a lot, but I am setting it up so that the user can select which products to show.
In plotly it's convenient to use data in long format - see ?gather.
Please check the following:
library(dplyr)
library(tidyr)
library(vroom)
library(plotly)
types <- rep(times = 10, list(
col_integer(f = stats::runif,
min = 1,
max = 5)))
products = bind_cols(
tibble(ean = sample.int(1e9, 25)),
tibble(kategori = sample(c("kat1", "kat2", "kat3"), 25, replace = TRUE)),
gen_tbl(25, 10, col_types = types)
)
products_long <- gather(products, "key", "value", -ean, -kategori)
plot_ly(
products_long,
type = 'scatterpolar',
mode = "lines+markers",
r = ~value,
theta = ~key,
split = ~ean
)
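In current versions of tidyr, gather() is superseded by pivot_longer(), so the same reshaping can be sketched as below. The small hand-built products table (5 products instead of 25, built without vroom) is an assumption made to keep the example short.

```r
library(tidyr)
library(plotly)

set.seed(1)  # arbitrary seed for a reproducible toy table

# Toy stand-in for the products table: 5 products, scores X1..X10 in 1..5.
scores <- matrix(sample(1:5, 50, replace = TRUE), nrow = 5,
                 dimnames = list(NULL, paste0("X", 1:10)))
products <- cbind(
  data.frame(ean = sample.int(1e9, 5),
             kategori = sample(c("kat1", "kat2", "kat3"), 5, replace = TRUE)),
  as.data.frame(scores)
)

# One row per (product, variable) pair: 5 products x 10 variables.
products_long <- pivot_longer(products, cols = X1:X10,
                              names_to = "key", values_to = "value")

# split = ~ean gives each product its own trace (line) on the radar chart.
p <- plot_ly(products_long,
             type = "scatterpolar",
             mode = "lines+markers",
             r = ~value,
             theta = ~key,
             split = ~ean)
p
```

Because each product is a separate trace, users can toggle individual products by clicking their legend entries.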

ddply to ksmooth function

I have a data frame with several columns. The relevant three are chr, pos and ratio. I want to use ddply to apply ksmooth per chr (chromosome), but I keep getting a wrong data frame with lots of NA values. Here is my reproducible data frame:
d=data.frame(chr=c(rep.int(1,24),rep.int(2,15),rep.int(3,30),rep.int(4,20),rep.int(5,11)),
pos=c(sort(sample(1:1000, size = 24, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 15, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 30, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 20, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 11, replace = FALSE),decreasing = FALSE)),
ratio=seq(1:100))
and the ddply call:
f <- ddply(d, .(chr),
function(e) {
as.data.frame(ksmooth(e$pos,e$ratio,"normal",bandwidth=10))
})
Obviously I'm doing something wrong.
Thanks for the help,
Guy
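This has nothing to do with plyr::ddply. The issue is with ksmooth. You want:
ksmooth(e$pos, e$ratio, "normal", bandwidth=10, x.points = e$pos)
Read ?ksmooth for what x.points means. If it is not supplied, ksmooth evaluates the fit at n.points evenly spaced points across the range of x instead. With a bandwidth of 10, many of those points have no observations within the kernel's reach, so the fit there is NA. This is the source of all your trouble.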
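The difference can be seen in a small base R sketch. The toy data (two chromosomes with positions every 50 units) and the split/lapply grouping used in place of ddply are assumptions chosen to make the behaviour deterministic, not the asker's data.

```r
# Toy data: two chromosomes with sparse, evenly spaced positions (made up).
d <- data.frame(
  chr   = rep(c(1, 2), each = 21),
  pos   = rep(seq(0, 1000, by = 50), times = 2),
  ratio = 1:42
)

# Without x.points, ksmooth() evaluates on its own evenly spaced grid;
# many grid points are too far from any observation and come back as NA.
chr1 <- d[d$chr == 1, ]
default_fit <- ksmooth(chr1$pos, chr1$ratio, "normal", bandwidth = 10)

# Per chromosome, evaluate the smoother at the observed positions only.
smoothed <- do.call(rbind, lapply(split(d, d$chr), function(e) {
  fit <- ksmooth(e$pos, e$ratio, "normal", bandwidth = 10, x.points = e$pos)
  data.frame(chr = e$chr[1], pos = fit$x, ratio_smooth = fit$y)
}))

anyNA(default_fit$y)          # NAs appear on the default grid
anyNA(smoothed$ratio_smooth)  # none when evaluating at the data points
```

The grouped result has one row per input row, which is usually what is wanted when the smoothed values are joined back to the original data.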
