Extracting values from a graph - r

I have a graph that is created from complex numbers by the function below. I would like to extract the data points that correspond to the line in the plot, so that I can work with them as a vector of data.
library(multitaper)
NW<-10
K<-5
x<-c(2,3,1,3,4,6,7,8,5,4,3,2,4,5,7,8,6,4,3,2,4,5,7,8,6,4,5,3,2,5,7,8,6,4,5,3,6,7,8,8,9,7,6,5,4,7)
resSpec <- spec.mtm(as.ts(x), k= K, nw=NW, nFFT = length(x),
centreWithSlepians = TRUE, Ftest = TRUE,
jackknife = FALSE, maxAdaptiveIterations = 100,
plot =FALSE, na.action = na.fail)
plot(resSpec)
What would be the best procedure? I have tried saving the plot as EMF. I wanted to use the ReadImages package, which I believe was the right one; however, it was not available for R version 3.02, so I could not use it. What would be the correct procedure for saving the plot and extracting the values? Are there other packages, and in what file types could I save the graph? As far as I can see, R (on Windows) only permits EMF.
Any help is welcome.
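A minimal sketch of reading the plotted values straight from the result object instead of from a saved image. It uses base R's spec.pgram() as a stand-in, because that function also returns a "spec" object with freq and spec components. That resSpec from spec.mtm() exposes the same two components is an assumption here and is worth checking with str(resSpec).

```r
# Base-R stand-in for the multitaper result: spec.pgram() returns a "spec" object.
x <- c(2, 3, 1, 3, 4, 6, 7, 8, 5, 4, 3, 2, 4, 5, 7, 8, 6, 4, 3, 2, 4, 5, 7, 8)
sp <- stats::spec.pgram(as.ts(x), taper = 0, plot = FALSE)

# The line drawn by plot(sp) is stored as two numeric vectors of equal length.
spec_points <- data.frame(freq = sp$freq, spec = sp$spec)
head(spec_points)

# Assumption: the spec.mtm() result has the same components, in which case
# data.frame(freq = resSpec$freq, spec = resSpec$spec) would be the analogue.
```

Working from the object avoids any digitising of an image file, so the choice of graphics format no longer matters for this purpose.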

Related

get results of principal Component Analysis in R

I want to get the results for PC1 and PC2 so that I can plot the curves of both on the same graph in Tableau Desktop.
How can I do this?
data = read.csv(file="data.csv",header=TRUE, sep=";")
data.active <- data[, 1:30]
library(factoextra)
res.pca <- prcomp(data.active,center = TRUE, scale. = TRUE)
fviz_eig(res.pca)
I think you need to write a CSV file with the results to pass them from R to Tableau. The code for that is written below:
# Principal Components Analysis
res.pca <- stats::prcomp(iris[,-5],center = TRUE, scale. = TRUE)
# Choose number of dimension kept
factoextra::fviz_eig(res.pca)
# Some visualisation
factoextra::fviz_pca_var(res.pca)
factoextra::fviz_pca_ind(res.pca)
factoextra::fviz_pca_biplot(res.pca)
# access transformed points
str(res.pca)
res.pca$x
# save points in csv to use outside of R
utils::write.csv(x = res.pca$x, file = "path/data_pca.csv")
# Load your data and do graphs the usual way with tableau
I used ?prcomp to find where the data is stored in the result. You may also push your analysis further and use some nice graphics (biplots of individuals / variables, clustering, ...) in R, and import only the images into Tableau, using: link
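Since the question asks only for PC1 and PC2, the export can be narrowed to those two columns. This is a small sketch on the built-in iris data; the id column and the output file name are hypothetical choices made for illustration, not part of the original answer.

```r
# PCA on the numeric columns of the built-in iris data
res.pca <- stats::prcomp(iris[, -5], center = TRUE, scale. = TRUE)

# Keep only the scores on the first two components
scores <- as.data.frame(res.pca$x[, c("PC1", "PC2")])

# Hypothetical row identifier, convenient as an axis or join key in Tableau
scores$id <- seq_len(nrow(scores))

# Hypothetical output location; replace with a real path as needed
out_file <- file.path(tempdir(), "data_pca_pc1_pc2.csv")
utils::write.csv(scores, file = out_file, row.names = FALSE)
```

Setting row.names = FALSE keeps the CSV free of an unnamed first column, which tends to be easier to handle on import.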

How to store results of a simulation and plot all the results in one plot using plot_KDE in Luminescence package for R

I am creating a simulation using random number simulations. This gives 100 sets of 45 values with error.
First I would like to store the results of these simulations.
I then need to plot the results of these simulations on one plot. The plot I need to produce uses the package Luminescence and is of the type KDE.
I have managed to produce the separate entities but am struggling to both store the results and to produce the plot with all the simulations.
So far I have created the simulation:
Simulation <- function() {
RNC <- rescale (SFMT(45, dim=1, mexp=216091,
usepset=T, withtorus= F, usetime=T),
c(0.01,130))
RNC_error <- RNC*0.15
df <-data.frame(RNC,RNC_error)
}
the plot I want to create uses the following:
library("Luminescence")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
For my final result I require the numerical results of all the simulations stored and a single KDE plot with the results of all the simulations.
Split your problem into two parts.
Storing results. You have a data frame, df, so just use write.csv() to store the results to a CSV file, i.e.
write.csv(df, file="some_file.csv")
Storing your plot. Obviously you can't use a csv file, so instead we'll use a pdf or png, e.g.
# Open the file
pdf("figure_file.pdf")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
# Close the file
dev.off()
To save as a PNG, use png() instead of pdf().
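The question also asks for all 100 simulations to be kept together. One possible sketch is to run the simulation in a loop, tag each run, and bind everything into a single data frame. Here runif() is a base-R stand-in for the rescale(SFMT(...)) call in the question, and the function name, the run column and the file name are hypothetical.

```r
# One simulated data set: 45 values with a 15 % error, returned explicitly
simulate_once <- function(n = 45) {
  RNC <- runif(n, min = 0.01, max = 130)  # stand-in for rescale(SFMT(...))
  data.frame(RNC = RNC, RNC_error = RNC * 0.15)
}

set.seed(1)  # only to make this sketch reproducible

# Run 100 simulations and label each row with its run number
runs <- lapply(seq_len(100), function(i) cbind(run = i, simulate_once()))
all_runs <- do.call(rbind, runs)

# Store every simulated value in one CSV file (hypothetical file name)
write.csv(all_runs, file.path(tempdir(), "all_simulations.csv"), row.names = FALSE)

# A single KDE plot could then be drawn from the pooled values, for example:
# Luminescence::plot_KDE(data = all_runs[, c("RNC", "RNC_error")])
```

Note that the function returns the data frame explicitly; in the question's version the last statement is an assignment, so the result is returned invisibly and df does not exist outside the function.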

Dendrogram and HistDAWass package

I am using the HistDAWass package (https://cran.r-project.org/web/packages/HistDAWass/index.html) to perform clustering using a script partially provided by the package author.
As the Data1.csv file does not include a column with the sample names (labels), I get a dendrogram whose leaves are labelled I1...I6.
Therefore, I tried to work with a new file (Data2.csv) whose first column contains the labels, but I get an error.
I would appreciate it if someone could explain how to generate the dendrogram with the new labels.
Script:
library(HistDAWass)
data=read.csv('D:/Data1.csv', header = FALSE)
data=t(data)
Hdata=MatH(nrows=6,ncols = 1)
for (i in 1:get.MatH.nrows(Hdata)){
tmp=data2hist(as.vector(data[,i]))
Hdata@M[i,1][[1]]=tmp
}
results=WH_hclust(x = Hdata,simplify = TRUE, method="complete")
plot(results) # it plots the dendrogram
Data files (in zip):
http://ge.tt/8yVsiQS2/v/0
The script shows a way of generating a matrix in which each cell holds a distributionH object. In the for loop, a distributionH is created from the raw data (one for each row of the csv file), and a new MatH (a matrix of distributions) is built.
To build the same from the Data2.csv file, you should run the following script:
library(HistDAWass)
#read data
data=read.csv('Data2.csv', header = FALSE)
#initialize an empty MatH matrix using names from the first column of data
Hdata=MatH(nrows=nrow(data),rownames=as.list(as.character(data[,1])),ncols = 1)
#Fill the matrix
for (i in 1:get.MatH.nrows(Hdata)){
tmp=data2hist(as.vector(t(data[i,2:ncol(data)])))
Hdata@M[i,1][[1]]=tmp
}
#Do hierarchical clustering
results=WH_hclust(x = Hdata,simplify = TRUE, method="complete")
plot(results) # it plots the dendrogram

'Not a graph object' error when performing degree() from igraph package

Since I need normalized scores, I wanted to call the degree() function on my adjacency matrix I got from a text file I loaded into R using read.delim. That went perfectly fine with the sna package.
When I run
K3_T2_ACAD <- diag.remove(read.delim("K3_T2_ACAD.txt", header = TRUE,
sep = "\t", row.names = 1), remove.val=0)
and then
K3_T2_ACAD_indeg <- degree(K3_T2_ACAD, g=1, nodes=NULL, gmode="digraph",
diag=FALSE, tmaxdev=FALSE, cmode="indegree")
it works!
I tried detaching the sna functions because I thought that was the problem. However, when I run the igraph degree() function, it does not work:
K3_T2_ACAD_indeg2 <- degree(K3_T2_ACAD, mode ="in", loops = FALSE, normalized = TRUE)
returns
Error in degree(K3_T2_ACAD, mode = "in", loops = FALSE, normalized =
TRUE) : Not a graph object
The first column and row each contain the participant codes. Is it possible, that igraph cannot work with that, whereas sna can?
The sna package works directly on adjacency matrices; igraph does not. You need to create an igraph object to work on. See e.g. http://igraph.org/r/doc/aaa-igraph-package.html
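As a rough sketch of that conversion, the adjacency matrix can be turned into a graph with graph_from_adjacency_matrix() before calling degree(). The small matrix below is made-up illustration data; with the data frame returned by read.delim(), an as.matrix() call would be needed first.

```r
library(igraph)

# Made-up 3 x 3 directed adjacency matrix with participant codes as dimnames
adj <- matrix(c(0, 1, 1,
                0, 0, 1,
                1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Build the igraph object; diag = FALSE ignores self-ties on the diagonal
g <- graph_from_adjacency_matrix(adj, mode = "directed", diag = FALSE)

# Normalized in-degree, now computed on a graph object rather than a matrix
indeg <- degree(g, mode = "in", loops = FALSE, normalized = TRUE)
indeg
```

If both sna and igraph are attached, writing igraph::degree() makes it explicit which of the two functions is being called.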

R programming - Graphic edges too large error while using clustering.plot in EMA package

I'm an R programming beginner and I'm trying to implement the clustering.plot method available in R package EMA. My clustering works fine and I can see the results populated as well. However, when I try to generate a heat map using clustering.plot, it gives me an error "Error in plot.new (): graphic edges too large". My code below,
#Loading library
library(EMA)
library(colonCA)
#Some information about the data
data(colonCA)
summary(colonCA)
class(colonCA) #Expression set
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
#Applying average linkage clustering on colonCA data using Pearson correlation
expr_genes <- genes.selection(expr_mat, thres.num=100)
expr_sample <- clustering(expr_mat[expr_genes,],metric = "pearson",method = "average")
expr_gene <- clustering(data = t(expr_mat[expr_genes,]),metric = "pearson",method = "average")
expr_clust <- clustering.plot(tree = expr_sample,tree.sup=expr_gene,data=expr_mat[expr_genes,],title = "Heat map of clustering",trim.heatmap =1)
I do not get any error when it comes to actually executing the clustering process. Could someone help?
In your example, some of the rownames of expr_mat are very long (max(nchar(rownames(expr_mat))) returns 271 characters). The clustering.plot function tries to make a margin large enough for all the names, but because the names are so long, there isn't room for anything else.
The really long names seem to have long stretches of periods in them. One way to condense the names of these genes is to replace runs of 2 or more periods with just one, so I would add in this line
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
rownames(expr_mat)<-gsub("\\.{2,}","\\.", rownames(expr_mat))
Then you can run all the other commands and plot like normal.
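To see what the substitution does, here is a small sketch on made-up names (the strings are illustrative and not taken from colonCA):

```r
# Made-up row names, two of them containing runs of periods
long_names <- c("Hsa.3004", "gene....with......dots", "a..b")

# Collapse every run of two or more periods into a single period
short_names <- gsub("\\.{2,}", ".", long_names)

# Compare the longest name before and after
max(nchar(long_names))
max(nchar(short_names))
```

Comparing the maximum name length before and after gives a quick check of whether the margin needed for the labels has shrunk enough.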
