How to change values in a data.frame column into numbers? - r

I have the following (sample) data.frame
x <- data.frame(gene = 1:3, Sample1 = 5:7, Sample2 = 4:6, Sample3 = 6:8)
I want to change the column names and then use the numbers in the new titles as x-axis values for my plot
colnames(x) <- c("Gene", "HeLa_0.2", "HeLa_2.0", "HeLa_5.0")
x_gather <- x %>%
gather(key=treatment, value=values, -c(Gene)) %>%
tidyr::separate(treatment, into=c("Cell_line", "treatment"),sep="_")
ggplot()+
geom_line(x_gather, mapping=aes(treatment, y=values, group=Gene))
But I want the numbers to be spaced along the x-axis according to their values (0.2, 2.0, 5.0), instead of evenly spaced like categories. So far I only get the numeric spacing if I copy my data to Excel, format the values as numbers, and then load it into R again.
Any suggestions to how to solve this?
Thanks! :)

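All you need to do is make the treatment variable numeric. separate() returns character columns by default, so ggplot treats treatment as discrete and spaces the values evenly. For instance:
library(dplyr)
library(tidyr)
library(ggplot2)
x_gather <- x %>%
  gather(key=treatment, value=values, -c(Gene)) %>%
  tidyr::separate(treatment, into=c("Cell_line", "treatment"),sep="_") %>%
  # convert "0.2", "2.0", "5.0" from character to numeric
  mutate(treatment = as.numeric(treatment))
ggplot()+
  geom_line(x_gather, mapping=aes(treatment, y=values, group=Gene))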
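As a variation on the answer above, separate() has a convert argument that can do the type conversion in the same step. This is a minimal sketch, assuming the renamed sample data from the question; pivot_longer() is swapped in for gather() here as my own substitution, not the answerer's code.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, already renamed
x <- data.frame(Gene = 1:3, HeLa_0.2 = 5:7, HeLa_2.0 = 4:6, HeLa_5.0 = 6:8)

x_long <- x %>%
  pivot_longer(cols = -Gene, names_to = "treatment", values_to = "values") %>%
  # convert = TRUE asks separate() to type-convert the new columns,
  # so "0.2", "2.0", "5.0" become numeric
  separate(treatment, into = c("Cell_line", "treatment"), sep = "_", convert = TRUE)

str(x_long$treatment)
```

With treatment numeric, the same geom_line() call from the answer should place the points at 0.2, 2 and 5 on a continuous axis.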

Related

Converting basic r barplot to ggplot

I've currently got a barplot that has a few basic parameters. However, I'm looking to try and convert this into ggplot. The extra parameters don't matter too much; the main problem that I'm having is that I'm trying to plot the sum of various columns, but I'm unable to transpose it correctly as t(data) doesn't seem to work. Here's what I've got so far:
## Subset of indicators
indicators <- clean_data[c(8, 12, 14:23)]
## Get sum of columns
indicator_sums <- colSums(indicators, na.rm = TRUE)
### Transpose for ggplot
(empty)
## Make bar plot
barplot(indicator_sums, ylim=range(pretty(c(0, indicator_sums))), cex.axis=0.75,cex.lab=0.8, cex.names=0.7, col='magenta', las=2, ylab = 'Offences Recorded Using Indicator')
You may try
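library(dplyr)
library(reshape2)
library(tibble)   # for rownames_to_column()
library(ggplot2)
dummy <- data.frame(
  A = c(1:20),
  B = rnorm(20, 10, 4),
  C = runif(20, 19,30),
  D = sample(c(10:40),20, replace = TRUE)
)
# base graphics version, for comparison
barplot(colSums(dummy))
# ggplot version: column sums -> data frame -> names as a column
dummy %>%
  colSums %>%
  melt %>%
  rownames_to_column %>%
  ggplot(aes(x = rowname, y = value)) +
  geom_col()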
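If you would rather not depend on reshape2, the named vector from colSums() can be turned into a two-column data frame directly, so no transpose is needed. This is a rough sketch: dummy stands in for the asker's indicators data, and the column names indicator and total are my own choices.

```r
library(ggplot2)

# Hypothetical stand-in for the indicator columns
set.seed(1)
dummy <- data.frame(A = 1:20, B = rnorm(20, 10, 4), C = runif(20, 19, 30))

# colSums() returns a named numeric vector; split it into names and values
sums <- colSums(dummy, na.rm = TRUE)
sums_df <- data.frame(indicator = names(sums), total = unname(sums))

p <- ggplot(sums_df, aes(x = indicator, y = total)) +
  geom_col(fill = "magenta") +
  labs(x = NULL, y = "Offences Recorded Using Indicator") +
  # rotate the labels, roughly what las = 2 does in barplot()
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```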

Plot multiple lists on the same graph in r (scatter plot)

I was trying to plot a graph that looks like the below figure based on the code under it:
xAxisName <- c("ML", "MN")
car1 <- c(5,6)
names(car1) <- xAxisName
car2 <- c(5.5,6.2)
names(car2) <- xAxisName
car3 <- c(4.9, 5.4)
names(car3) <- xAxisName
The plot plots 2 car properties on the x axis and each property has 3 car values. But these are separate lists. How could this plot be plotted?
Get all the 'car' objects into a list with mget, bind them with bind_rows, pivot to 'long' format, and then use ggplot
library(ggplot2)
library(dplyr)
library(tidyr)
mget(ls(pattern = '^car\\d+$')) %>%
bind_rows(.id = 'car') %>%
pivot_longer(cols = -car) %>%
ggplot(aes(x = name, y = value, color = car)) +
geom_point()+
scale_y_continuous(expand = c(5, 6))

ggplot for multiple values in the same row

I have a data frame with multiple values in the same row
index price
1 1000,2000,3000
2 2000,500
The data frame has 12 rows and not all price rows have equal length. I want to plot index vs price with index along x-axis and price along y-axis. I have the following code-
ggplot(data_m,
aes(x = 1:12,
y = data_m$price))
I get the error- Error: Aesthetics must be either length 1 or the same as the data (12): y
How do I plot every value in the price column?
Maybe you are looking for this. You have to reshape the data and then pick a plotting strategy, as mentioned by @TheSciGuy. Here is a tidyverse approach that uses separate_rows() to split the values in each row and then a full_join() to fill in the full range of index values you want on the axis. Here is the code:
library(tidyverse)
#Data and plot
df %>% separate_rows(price,sep=',') %>%
mutate(price=as.numeric(price)) %>%
full_join(data.frame(index=1:12)) %>%
ggplot(aes(x=factor(index),y=price))+
geom_point()+
xlab('index')
Some data used:
#Data
df <- structure(list(index = 1:2, price = c("1000,2000,3000", "2000,500"
)), class = "data.frame", row.names = c(NA, -2L))
And if you want some color per index:
#Data and plot 2
df %>% separate_rows(price,sep=',') %>%
mutate(price=as.numeric(price)) %>%
full_join(data.frame(index=1:12)) %>%
ggplot(aes(x=factor(index),y=price,color=factor(index)))+
geom_point()+
xlab('index')+
theme(legend.position = 'none')
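To see what the reshaping step does, here is a base R sketch of the same idea without the tidyverse. It assumes the two-row sample df shown above; long is a name I made up for the result.

```r
# Sample data, same as in the answer
df <- data.frame(index = 1:2,
                 price = c("1000,2000,3000", "2000,500"),
                 stringsAsFactors = FALSE)

# Split each price string on commas; the result is a list with one vector per row
prices <- strsplit(df$price, ",", fixed = TRUE)

# Repeat each index once per price it owns, then flatten the prices to numeric
long <- data.frame(
  index = rep(df$index, lengths(prices)),
  price = as.numeric(unlist(prices))
)
long
```

Each price now sits on its own row next to its index, which is the shape ggplot needs in order to plot every value.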

Avoid converting numbers to dates in plotly

I have a matrix that I want to plot as a heatmap in plotly. The row names are assays and the column names are CASRNs in a format like "131-55-5".
My matrix looks like this:
[image: the data matrix for the heatmap]
For some reason plotly thinks these are dates, converts them to something like March 2000, and gives me an empty plot.
Before converting my data frame to a matrix I checked, and all columns are factors.
Is there any way to make sure my numbers won't turn into dates when I plot my matrix?
This is the code I am using for my heatmap:
plot_ly(x=colnames(dm_new2), y=rownames(dm_new2), z = dm_new2, type = "heatmap") %>%
layout(margin = list(l=120))
Using some random data to mimic your dataset. Simply put your matrix in a dataframe. Try this:
set.seed(42)
library(plotly)
library(dplyr)
library(tidyr)
dm_new2 <- matrix(runif(12), nrow = 4, dimnames = list(LETTERS[1:4], c("131-55-5", "113-48-4", "1582-09-8")))
# Put matrix in a dataframe
dm_new2 <- as.data.frame(dm_new2) %>%
# rownames to column
mutate(x = row.names(.)) %>%
# convert to long format
pivot_longer(-x, names_to = "y", values_to = "value")
dm_new2 %>%
plot_ly(x = ~x, y = ~y, z = ~value, type = "heatmap") %>%
layout(margin = list(l=120))
Created on 2020-04-08 by the reprex package (v0.3.0)
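Another option that may be worth trying is to keep the matrix as it is and tell plotly that the x-axis is categorical, so it does not try to guess a date axis. This is only a sketch on the same random data, assuming the axis type is what triggers the date conversion; I have not checked it against the asker's real matrix.

```r
library(plotly)

set.seed(42)
# Random stand-in for the assay x CASRN matrix
dm <- matrix(runif(12), nrow = 4,
             dimnames = list(LETTERS[1:4], c("131-55-5", "113-48-4", "1582-09-8")))

p <- plot_ly(x = colnames(dm), y = rownames(dm), z = dm, type = "heatmap") %>%
  # type = "category" forces a discrete axis instead of letting plotly infer one
  layout(xaxis = list(type = "category"), margin = list(l = 120))
p
```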

"Dotplot" visualisation with factors

I am not sure how to approach this. I want to create a "dotplot" style plot in R from a data frame of categorical variables (factors) such that for each column of the df I plot a column of dots, each coloured according to the factors. For example,
my_df <- cbind(c('sheep','sheep','cow','cow','horse'),c('sheep','sheep','sheep','sheep',NA),c('sheep','cow','cow','cow','cow'))
I then want to end up with a 3 x 5 grid of dots, each coloured according to sheep/cow/horse (well, one missing because of the NA).
Do you mean something like this:
my_df <- cbind(c('sheep','sheep','cow','cow','horse'),
c('sheep','sheep','sheep','sheep',NA),
c('sheep','cow','cow','cow','cow'))
df <- data.frame(my_df) # make it as data.frame
df$id <- row.names(df) # add an id
library(reshape2)
melt_df <-melt(df,'id') # melt it
library(ggplot2) # now the plot
p <- ggplot(melt_df, aes(x = variable, fill = value))
p + geom_dotplot(stackgroups = TRUE, binwidth = 0.3, binpositions = "all")
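If what you want is literally a 3 x 5 grid with one dot per cell, another way to sketch it is to plot each cell with geom_point() at its row and column position. This assumes the my_df matrix from above; the names grid_df, column, row and animal are just for illustration.

```r
library(ggplot2)

my_df <- cbind(c('sheep','sheep','cow','cow','horse'),
               c('sheep','sheep','sheep','sheep',NA),
               c('sheep','cow','cow','cow','cow'))

# One row per matrix cell: column index, row index, and the value in that cell
grid_df <- data.frame(
  column = factor(as.vector(col(my_df))),
  row    = as.vector(row(my_df)),
  animal = as.vector(my_df)
)

# Drop the missing cell so no dot is drawn there
grid_df <- grid_df[!is.na(grid_df$animal), ]

p <- ggplot(grid_df, aes(x = column, y = row, colour = animal)) +
  geom_point(size = 6) +
  scale_y_reverse()   # put row 1 at the top, matching the matrix layout
print(p)
```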
