How can I make a “color map” plot in IDL?

I would like to make a graph like the one produced by the code shown below (but with logarithmic axes). I have a 2D matrix containing the data, and I know the separation positions between one cell and the next (equispaced when viewed on a logarithmic scale). The code below simulates what I would like to obtain, but it uses Hist_2D, so I do not think it is usable in my case.
An example of my data:
data is a 9×9 matrix:
data [0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
0.429 0.143 0.000 0.000 0.048 0.000 0.000 0.000 0.000
0.857 0.810 0.667 0.429 0.429 0.286 0.190 0.286 0.143
0.952 0.952 0.905 0.857 0.857 0.905 0.857 0.762 0.810
1.000 1.000 0.952 0.952 0.952 0.952 0.952 0.952 1.000
1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000]
x and y are two vectors of length 10:
x [0.189036 0.484322 0.779609 1.07489 1.37018 1.66547 1.96075 2.25604 2.55133]
y [ -1.06208 -0.584192 -0.106299 0.371593 0.849485 1.32738 1.80527 2.28316 2.76105]
The code:
PRO Plot2
   x = cgScaleVector(Randomn(-3L, 100000)*3., -10, 10)
   y = cgScaleVector(Randomn(-5L, 100000)*10., 0, 100)
   xrange = [Min(x), Max(x)]
   yrange = [Min(y), Max(y)]
   xbinsize = 0.25
   ybinsize = 3.00
   cgDisplay
   density = Hist_2D(x, y, Min1=xrange[0], Max1=xrange[1], Bin1=xbinsize, Min2=yrange[0], Max2=yrange[1], Bin2=ybinsize)
   maxDensity = Ceil(Max(density)/1e2) * 1e2
   scaledDensity = BytScl(density, Min=0, Max=maxDensity)
   cgLoadCT, 33
   TVLCT, cgColor('gray', /Triple), 0
   TVLCT, r, g, b, /Get
   palette = [ [r], [g], [b] ]
   cgImage, scaledDensity, XRange=xrange, YRange=yrange, /Axes, Palette=palette, $
      XTitle='Concentration of X', YTitle='Concentration of Y', $
      Position=[0.125, 0.125, 0.9, 0.8]
   thick = (!D.Name EQ 'PS') ? 6 : 2
   cgContour, density, LEVELS=maxDensity*[0.25, 0.5, 0.75], /OnImage, $
      C_Colors=['Tan','Tan', 'Brown'], C_Annotation=['Low', 'Avg', 'High'], $
      C_Thick=thick, C_CharThick=thick
   cgColorbar, Position=[0.125, 0.875, 0.9, 0.925], Title='Density', $
      Range=[0, maxDensity], NColors=254, Bottom=1, OOB_Low='gray', $
      TLocation='Top'
END ;*****************************************************************
Plot2
END
Thanks for your help!

In the code you have posted, Hist_2D computes the density map that is then displayed by cgImage. Since you already have the matrix you want to display (data), you can simply run:
cgImage, data, /axes, /scale, /keep, xrange=[0.04,2.70], yrange=[-1.30,3.00]

Related

Binning by equal standard deviation R

I have a vector containing some data, in particular
> tau_3[p_3<3]
[1] 7.837 7.813 6.276 8.669 7.001 6.032 6.897 5.967 9.417 8.251 7.892 8.752 9.873 9.461 8.591 7.697 8.372 9.324 9.135 7.807
[21] 10.034 10.701 9.315 6.979 9.843 8.742 8.829 7.406 8.588 6.803 7.462 8.379 8.075 8.294 8.218
which has to be studied with respect to another set of datapoints
>p_3[p_3<3]
[1] 0.020 0.021 0.022 0.023 0.024 0.026 0.028 0.014 0.029 0.030 0.033 0.035 0.037 0.040 0.042 0.044 0.050 0.055 0.060 0.065 0.070 0.075 0.080 0.085
[25] 0.090 0.100 0.110 0.120 0.130 0.150 0.160 0.190 0.200 0.230 0.240
I would like to divide the pressure p_3 data (the subset given above) in such a way that each bin has, more or less, the same standard deviation of the decay time tau_3 data it contains. In particular, I need a vector containing the breaks for such binned data.
I don't know of any package that could do this, and I've been scratching my head for hours over how to do it. If you could give me a solution I would be very grateful.
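One possible starting point (an untested sketch, not an existing package): sort the points by p_3 and grow each bin greedily, closing it as soon as the standard deviation of the tau_3 values it contains reaches a chosen target:
# greedy sketch: each bin grows until sd(tau) inside it reaches target_sd
equal_sd_breaks <- function(p, tau, target_sd) {
  ord <- order(p)
  p <- p[ord]; tau <- tau[ord]
  breaks <- p[1]                  # left edge of the first bin
  start <- 1
  for (i in seq_along(p)) {
    if (i > start && sd(tau[start:i]) >= target_sd) {
      breaks <- c(breaks, p[i])   # close the current bin here
      start <- i + 1
    }
  }
  c(breaks, p[length(p)])         # right edge of the last bin
}
breaks <- equal_sd_breaks(p_3[p_3 < 3], tau_3[p_3 < 3], target_sd = 1)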

Importing many files at the same time and adding ID indicator

I have 91 files in .log format:
Trajectory Log File
Rock type: 2 (0: Sphere, 1: Cuboid, 2: Rock)
Nr of Trajectories: 91
Trajectory-Mode: ON
Average Slope (Degrees): 28.05 / 51.99 / 64.83
Filename: test_tschamut_Pos1.xml
Z-offset: 1.32000
Rock Position X: 696621.38
Rock Position Y: 167730.02
Rock Position Z: 1679.6400
Friction:
Overall Type: Medium
t (s) x (m) y (m) z (m) p0 () p1 () p2 () p3 () vx (m s-1) vy (m s-1) vz (m s-1) wx (rot s-1) wy (rot s-1) wz (rot s-1) Etot (kJ) Ekin (kJ) Ekintrans (kJ) Ekinrot (kJ) zt (m) Fv (kN) Fh (kN) Slippage (m) mu_s (N s m-1) v_res (m s-1) w_res (rot s-1) JumpH (m) ProjDist (m) Jc () JH_Jc (m) SD (m)
0.000 696621.380 167730.020 1680.960 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1192.526 0.000 0.000 0.000 1677.754 0.000 0.000 0.000 0.350 0.000 0.000 3.206 0.000 0.000 0.000 0.000
0.010 696621.380 167730.020 1680.959 1.000 0.000 -0.000 0.000 0.000 0.000 -0.098 0.000 0.000 0.000 1192.526 0.010 0.010 0.000 1677.754 0.000 0.000 0.000 0.350 0.098 0.000 3.205 0.000 0.000 0.000 0.000
0.020 696621.380 167730.020 1680.958 1.000 0.000 -0.000 0.000 0.000 0.000 -0.196 0.000 0.000 0.000 1192.526 0.039 0.039 0.000 1677.754 0.000 0.000 0.000 0.350 0.196 0.000 3.204 0.000 0.000 0.000 0.000
0.040 696621.380 167730.020 1680.952 1.000 0.000 -0.000 0.000 0.000 0.000 -0.392 0.000 0.000 0.000 1192.526 0.158 0.158 0.000 1677.754 0.000 0.000 0.000 0.350 0.392 0.000 3.198 0.000 0.000 0.000 0.000
0.060 696621.380 167730.020 1680.942 1.000 0.000 -0.000 0.000 0.000 0.000 -0.589 0.000 0.000 0.000 1192.526 0.355 0.355 0.000 1677.754 0.000 0.000 0.000 0.350 0.589 0.000 3.188 0.000 0.000 0.000 0.000
I have managed to import one single file and to retain only the desired variables, which are t, x, y, z and Etot:
trjct <- read.table('trajectory_test_tschamut_Pos1.log', skip = 23)
trjct <- trjct[,c("V1","V2","V3", "V4", "V15")]
colnames(trjct) <- c("t", "x", "y", "z", "Etot")
> str(trjct)
'data.frame': 1149 obs. of 5 variables:
$ t : num 0 0.01 0.02 0.04 0.06 0.08 0.11 0.13 0.15 0.16 ...
$ x : num 696621 696621 696621 696621 696621 ...
$ y : num 167730 167730 167730 167730 167730 ...
$ z : num 1681 1681 1681 1681 1681 ...
$ Etot: num 1193 1193 1193 1193 1193 ...
However, I have 91 of these files and would like to analyse them all at once. I therefore want to create one large dataset that distinguishes the data from each file by adding an ID; a similar question has been answered here.
I have applied that code to my data, adjusting it here and there, but I keep getting errors.
# importing all files at the same time
files.list <- list.files(pattern = ".log")
trjct <- data.frame(t=numeric(),
                    x=numeric(),
                    z=numeric(),
                    Etot=numeric(),
                    stringsAsFactors=FALSE)
for (i in 1:length(files.list)) {
  df.next <- read.table(files.list[[i]], header=F, skip = 23)
  df.next$ID <- paste0('simu', i)
  df <- rbind(df, df.next)
}
I am getting an error:
Error in rep(xi, length.out = nvar) :
attempt to replicate an object of type 'closure'
QUESTIONS:
Where is the problem and how can I fix it?
Is there a better solution?
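The error itself answers the first question: the loop accumulates into df, but no data frame named df was ever created (the empty frame is called trjct), so R resolves df to the F-distribution density function stats::df, a closure that rbind cannot replicate. A minimal sketch of a fix that keeps the loop (assuming every file has the same 23-line header):
files.list <- list.files(pattern = "\\.log$")
trjct <- data.frame()                      # accumulate into the frame we actually created
for (i in seq_along(files.list)) {
  df.next <- read.table(files.list[[i]], header = FALSE, skip = 23)
  df.next <- df.next[, c(1:4, 15)]         # keep t, x, y, z, Etot
  colnames(df.next) <- c("t", "x", "y", "z", "Etot")
  df.next$ID <- paste0("simu", i)          # file indicator
  trjct <- rbind(trjct, df.next)           # was: rbind(df, df.next)
}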
You could also check out purrr::map_df, which behaves like lapply or a for loop but returns a data.frame:
read_traj <- function(fi) {
df <- read.table(fi, header=F, skip=23)
df <- df[, c(1:4, 15)]
colnames(df) <- c("t", "x", "y", "z", "Etot")
return(df)
}
files.list <- list.files(pattern = ".log")
library(tidyverse)
map_df has a handy argument, .id=..., that creates a column id with numbers 1...N, where N is the number of files:
map_df(files.list, ~read_traj(.x), .id="id")
If you want to save the file name instead, use the id column to index files.list:
map_df(files.list, ~read_traj(.x), .id="id") %>%
mutate(id = files.list[as.numeric(id)])
First of all, you should encapsulate the reading part in a function:
read_log_file <- function(path) {
trjct <- read.table(path, skip = 23)
trjct <- trjct[,c("V1","V2","V3", "V4", "V15")]
colnames(trjct) <- c("t", "x", "y", "z", "Etot")
return(trjct)
}
Then, you can create a list of data.frames using mapply (a member of the apply family that iterates over two sets of arguments in parallel; see the DataCamp article on the apply family if you want to know more).
files.list <- list.files(pattern = ".log")
ids <- 1:length(files.list)
df_list = mapply(function(path, id) {
df = read_log_file(path)
df$ID = id
return(df)
}, files.list, ids, SIMPLIFY=FALSE)
Note the SIMPLIFY=FALSE part: it stops mapply from simplifying the result and makes it return a plain list of data.frames instead.
Finally, you can concatenate all your data.frames into one with bind_rows from the dplyr package:
df = dplyr::bind_rows(df_list)
Note: in general, in R, it is better to use the *apply family of functions than explicit loops.
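Incidentally, bind_rows can create the ID column by itself through its .id argument, which makes the mapply step optional; a minimal sketch, assuming the read_log_file function above:
df_list <- lapply(files.list, read_log_file)
df <- dplyr::bind_rows(df_list, .id = "ID")   # unnamed list, so ID is "1", "2", ...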

Convert column headers into new columns

My data frame consists of time series financial data from many public companies. I purposely set companies' weights as their column headers while cleaning the data, and I also calculated log returns for each of them in order to calculate weighted returns in the next step.
Here is an example. There are four companies: A, B, C and D, and their corresponding weights in the portfolio are 0.4, 0.3, 0.2 and 0.1 respectively. So the current data set looks like:
df1 <- data.frame(matrix(vector(),ncol=9, nrow = 4))
colnames(df1) <- c("Date","0.4","0.4.Log","0.3","0.3.Log","0.2","0.2.Log","0.1","0.1.Log")
df1[1,] <- c("2004-10-29","103.238","0","131.149","0","99.913","0","104.254","0")
df1[2,] <- c("2004-11-30","104.821","0.015","138.989","0.058","99.872","0.000","103.997","-0.002")
df1[3,] <- c("2004-12-31","105.141","0.003","137.266","-0.012","99.993","0.001","104.025","0.000")
df1[4,] <- c("2005-01-31","107.682","0.024","137.08","-0.001","99.782","-0.002","105.287","0.012")
df1
Date 0.4 0.4.Log 0.3 0.3.Log 0.2 0.2.Log 0.1 0.1.Log
1 2004-10-29 103.238 0 131.149 0 99.913 0 104.254 0
2 2004-11-30 104.821 0.015 138.989 0.058 99.872 0.000 103.997 -0.002
3 2004-12-31 105.141 0.003 137.266 -0.012 99.993 0.001 104.025 0.000
4 2005-01-31 107.682 0.024 137.08 -0.001 99.782 -0.002 105.287 0.012
I want to create new columns that contain company weights so that I can calculate weighted returns in my next step:
Date 0.4 0.4.W 0.4.Log 0.3 0.3.W 0.3.Log 0.2 0.2.W 0.2.Log 0.1 0.1.W 0.1.Log
1 2004-10-29 103.238 0.400 0.000 131.149 0.300 0.000 99.913 0.200 0.000 104.254 0.100 0.000
2 2004-11-30 104.821 0.400 0.015 138.989 0.300 0.058 99.872 0.200 0.000 103.997 0.100 -0.002
3 2004-12-31 105.141 0.400 0.003 137.266 0.300 -0.012 99.993 0.200 0.001 104.025 0.100 0.000
4 2005-01-31 107.682 0.400 0.024 137.080 0.300 -0.001 99.782 0.200 -0.002 105.287 0.100 0.012
We can try
v1 <- grep("^[0-9.]+$", names(df1), value = TRUE)
df1[paste0(v1, ".W")] <- as.list(as.numeric(v1))
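Note that the new .W columns are appended at the end of df1. If you want them interleaved next to each weight's level and log-return columns, as in the desired output above, one way (a sketch assuming the column names shown) is:
# interleave level, weight and log-return columns for each company
ord <- c("Date", as.vector(rbind(v1, paste0(v1, ".W"), paste0(v1, ".Log"))))
df1 <- df1[ord]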

Export PCA components in r

I did PCA on my data using R and I am trying to save the components with an eigenvalue larger than 1.
> summary(pca1)
Importance of components:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
Standard deviation 1.2851803 1.1245020 1.0737268 1.0011978 0.9841687 0.88758402 0.84798807 0.67308490
Proportion of Variance 0.2064611 0.1580631 0.1441112 0.1252996 0.1210735 0.09847567 0.08988547 0.05663041
Cumulative Proportion 0.2064611 0.3645241 0.5086353 0.6339349 0.7550084 0.85348412 0.94336959 1.00000000
> loadings(pca1)
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
AspectRatio 0.604 0.325 0.230 0.194 0.652
CPUSpeed 0.241 0.278 0.890 -0.242
IsDvrEnabled 0.428 -0.329 -0.109 -0.290 -0.724 -0.281
ZoomMode 0.123 0.837 -0.133 -0.232 -0.124 -0.432
Tuner_BitRate 0.600 -0.272 0.392 0.161 -0.616
Tuner_Hole -0.948 0.306
Receiver_VideoDecoderErrors -0.705 0.283 -0.640
Receiver_AudioDecoderErrors -0.128 -0.690 -0.275 0.650
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Proportion Var 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
Cumulative Var 0.125 0.250 0.375 0.500 0.625 0.750 0.875 1.000
So in this case I am interested in the first four components.
Is there a way that I can save them in a table or a file (a file is preferred)?
Thank you!
loadings(pca1) returns the PCA loadings; unclass drops the class and converts them into a plain matrix.
pca1$sdev^2 > 1 returns TRUE for columns where the eigenvalue is greater than 1. [..., drop = F] selects the columns where the index is TRUE while keeping the matrix structure even when only one column is selected. write.csv writes the result to a file.
Final Code: write.csv(x = unclass(loadings(pca1))[,(pca1$sdev^2 > 1),drop = FALSE], file = "filename.csv")
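The same logic broken into commented steps (same hypothetical file name):
L <- unclass(loadings(pca1))       # loadings as a plain matrix
keep <- pca1$sdev^2 > 1            # eigenvalues are the squared standard deviations
write.csv(L[, keep, drop = FALSE], file = "filename.csv")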

Reading a symmetric matrix from file that omits upper triangular part

Using R, what is the best way to read a symmetric matrix from a file that omits the upper triangular part? For example:
1.000
.505 1.000
.569 .422 1.000
.602 .467 .926 1.000
.621 .482 .877 .874 1.000
.603 .450 .878 .894 .937 1.000
I have tried read.table, but haven't been successful.
Here's a loopless and *apply-less read.table solution:
txt <- "1.000
.505 1.000
.569 .422 1.000
.602 .467 .926 1.000
.621 .482 .877 .874 1.000
.603 .450 .878 .894 .937 1.000"
# Could use clipboard or read this from a file as well.
mat <- data.matrix( read.table(text=txt, fill=TRUE, col.names=paste0("V", 1:6)) )
mat[upper.tri(mat)] <- t(mat)[upper.tri(mat)]
> mat
V1 V2 V3 V4 V5 V6
[1,] 1.000 0.505 0.569 0.602 0.621 0.603
[2,] 0.505 1.000 0.422 0.467 0.482 0.450
[3,] 0.569 0.422 1.000 0.926 0.877 0.878
[4,] 0.602 0.467 0.926 1.000 0.874 0.894
[5,] 0.621 0.482 0.877 0.874 1.000 0.937
[6,] 0.603 0.450 0.878 0.894 0.937 1.000
I copied your text, and then used tt <- file('clipboard','rt') to import it. For a standard file:
tt <- file("yourfile.txt",'rt')
a <- readLines(tt)
b <- strsplit(a," ") #insert delimiter here; can use regex
b <- lapply(b,function(x) {
x <- as.numeric(x)
length(x) <- max(unlist(lapply(b,length)));
return(x)
})
b <- do.call(rbind,b)
b[is.na(b)] <- 0
#kinda kludgy way to get the symmetric matrix
b <- b + t(b) - diag(b[1,1], nrow=dim(b)[1], ncol=dim(b)[2])
I'm posting this anyway, but I like Blue Magister's approach way better. Maybe there's something in it that's of use.
mat <- readLines(n=6)
1.000
.505 1.000
.569 .422 1.000
.602 .467 .926 1.000
.621 .482 .877 .874 1.000
.603 .450 .878 .894 .937 1.000
nmat <- lapply(mat, function(x) unlist(strsplit(x, "\\s+")))
lens <- sapply(nmat, length)
dlen <- max(lens) - lens
bmat <- lapply(seq_along(nmat), function(i) {
as.numeric(c(nmat[[i]], rep(NA, dlen[i])))
})
mat <- do.call(rbind, bmat)
mat[upper.tri(mat)] <- t(mat)[upper.tri(mat)]
mat
Here is an approach which also works if the dimensions of the matrix are unknown.
# read file as a vector
mat <- scan("file.txt", what = numeric())
# calculate the number of columns (and rows): n values fill a lower
# triangle of k rows when n = k * (k + 1) / 2, so solve for k
ncol <- (sqrt(8 * length(mat) + 1) - 1) / 2
# index of the diagonal values
diag_idx <- cumsum(seq.int(ncol))
# generate split index
split_idx <- cummax(sequence(seq.int(ncol)))
split_idx[diag_idx] <- split_idx[diag_idx] - 1
# split vector into list of rows
splitted_rows <- split(mat, f = split_idx)
# generate matrix
mat_full <- suppressWarnings(do.call(rbind, splitted_rows))
mat_full[upper.tri(mat_full)] <- t(mat_full)[upper.tri(mat_full)]
[,1] [,2] [,3] [,4] [,5] [,6]
0 1.000 0.505 0.569 0.602 0.621 0.603
1 0.505 1.000 0.422 0.467 0.482 0.450
2 0.569 0.422 1.000 0.926 0.877 0.878
3 0.602 0.467 0.926 1.000 0.874 0.894
4 0.621 0.482 0.877 0.874 1.000 0.937
5 0.603 0.450 0.878 0.894 0.937 1.000
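To try this approach without writing a file first, scan can read the same numbers from a string through its text argument:
txt <- "1.000
.505 1.000
.569 .422 1.000
.602 .467 .926 1.000
.621 .482 .877 .874 1.000
.603 .450 .878 .894 .937 1.000"
mat <- scan(text = txt, what = numeric())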
This won't work in the OP's case because the diagonal was 1, but if the diagonal is zero or missing, then you can use as.dist %>% as.matrix to copy the lower triangle to the upper triangle and set the diagonal to zero:
input=" Pop0 Pop1 Pop2
Pop0
Pop1 0.015
Pop2 0.079 0.083
Pop3 0.014 0.016 0.073"
as.matrix(as.dist(cbind(read.table(text=input,fill=T),NA)))
Result:
Pop0 Pop1 Pop2 Pop3
Pop0 0.000 0.015 0.079 0.014
Pop1 0.015 0.000 0.083 0.016
Pop2 0.079 0.083 0.000 0.073
Pop3 0.014 0.016 0.073 0.000
In my case the input had column names, so read.table(fill=T) was automatically able to determine the number of columns, and IRTFM's trick of specifying col.names=1:4 was not needed.
