corr.test arguments imply differing number of rows in R

I have seen this error multiple times in different projects, and I was wondering: is there a way to tell, in general, which line caused the error?
My specific case:
http://archive.ics.uci.edu/ml/machine-learning-databases/00275/
#using the bike.csv
data<-read.csv("PATH_HERE\\Bike-Sharing-Dataset\\day.csv",header=TRUE)
require(psych)
corr.test(data)
data<-data[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
"season","temp","weathersit","weekday","windspeed","workingday","yr")]
data[data=='']<-NA
#View(data)
require(psych)
cors<-corr.test(data)
returns the error:
Error in data.frame(lower = lower, r = r[lower.tri(r)], upper = upper, :
arguments imply differing number of rows: 0, 91

It works for me
> #using the bike.csv
> data <- read.csv("day.csv",header=TRUE)
> require(psych)
> corr.test(data)
Error in cor(x, use = use, method = method) : 'x' must be numeric
> data <- data[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
+ "season","temp","weathersit","weekday","windspeed","workingday","yr")]
> data[data==''] <- NA
> #View(data)
>
> require(psych)
> cors <- corr.test(data)
> cors
Call:corr.test(x = data)
Correlation matrix
atemp casual cnt holiday hum mnth registered season temp
atemp 1.00 0.54 0.63 -0.03 0.14 0.23 0.54 0.34 0.99
casual 0.54 1.00 0.67 0.05 -0.08 0.12 0.40 0.21 0.54
cnt 0.63 0.67 1.00 -0.07 -0.10 0.28 0.95 0.41 0.63
holiday -0.03 0.05 -0.07 1.00 -0.02 0.02 -0.11 -0.01 -0.03
hum 0.14 -0.08 -0.10 -0.02 1.00 0.22 -0.09 0.21 0.13
mnth 0.23 0.12 0.28 0.02 0.22 1.00 0.29 0.83 0.22
registered 0.54 0.40 0.95 -0.11 -0.09 0.29 1.00 0.41 0.54
season 0.34 0.21 0.41 -0.01 0.21 0.83 0.41 1.00 0.33
temp 0.99 0.54 0.63 -0.03 0.13 0.22 0.54 0.33 1.00
weathersit -0.12 -0.25 -0.30 -0.03 0.59 0.04 -0.26 0.02 -0.12
weekday -0.01 0.06 0.07 -0.10 -0.05 0.01 0.06 0.00 0.00
windspeed -0.18 -0.17 -0.23 0.01 -0.25 -0.21 -0.22 -0.23 -0.16
workingday 0.05 -0.52 0.06 -0.25 0.02 -0.01 0.30 0.01 0.05
yr 0.05 0.25 0.57 0.01 -0.11 0.00 0.59 0.00 0.05
weathersit weekday windspeed workingday yr
atemp -0.12 -0.01 -0.18 0.05 0.05
casual -0.25 0.06 -0.17 -0.52 0.25
cnt -0.30 0.07 -0.23 0.06 0.57
holiday -0.03 -0.10 0.01 -0.25 0.01
hum 0.59 -0.05 -0.25 0.02 -0.11
mnth 0.04 0.01 -0.21 -0.01 0.00
registered -0.26 0.06 -0.22 0.30 0.59
season 0.02 0.00 -0.23 0.01 0.00
temp -0.12 0.00 -0.16 0.05 0.05
weathersit 1.00 0.03 0.04 0.06 -0.05
weekday 0.03 1.00 0.01 0.04 -0.01
windspeed 0.04 0.01 1.00 -0.02 -0.01
workingday 0.06 0.04 -0.02 1.00 0.00
yr -0.05 -0.01 -0.01 0.00 1.00
Sample Size
[1] 731
Probability values (Entries above the diagonal are adjusted for multiple tests.)
atemp casual cnt holiday hum mnth registered season temp
atemp 0.00 0.00 0.00 1.00 0.01 0.00 0.00 0.00 0.00
casual 0.00 0.00 0.00 1.00 1.00 0.04 0.00 0.00 0.00
cnt 0.00 0.00 0.00 1.00 0.28 0.00 0.00 0.00 0.00
holiday 0.38 0.14 0.06 0.00 1.00 1.00 0.15 1.00 1.00
hum 0.00 0.04 0.01 0.67 0.00 0.00 0.58 0.00 0.03
mnth 0.00 0.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00
registered 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00
season 0.00 0.00 0.00 0.78 0.00 0.00 0.00 0.00 0.00
temp 0.00 0.00 0.00 0.44 0.00 0.00 0.00 0.00 0.00
weathersit 0.00 0.00 0.00 0.35 0.00 0.24 0.00 0.60 0.00
weekday 0.84 0.11 0.07 0.01 0.16 0.80 0.12 0.93 1.00
windspeed 0.00 0.00 0.00 0.87 0.00 0.00 0.00 0.00 0.00
workingday 0.16 0.00 0.10 0.00 0.51 0.87 0.00 0.74 0.15
yr 0.21 0.00 0.00 0.83 0.00 0.96 0.00 0.96 0.20
weathersit weekday windspeed workingday yr
atemp 0.05 1.00 0.00 1.00 1.00
casual 0.00 1.00 0.00 0.00 0.00
cnt 0.00 1.00 0.00 1.00 0.00
holiday 1.00 0.25 1.00 0.00 1.00
hum 0.00 1.00 0.00 1.00 0.13
mnth 1.00 1.00 0.00 1.00 1.00
registered 0.00 1.00 0.00 0.00 0.00
season 1.00 1.00 0.00 1.00 1.00
temp 0.05 1.00 0.00 1.00 1.00
weathersit 0.00 1.00 1.00 1.00 1.00
weekday 0.40 0.00 1.00 1.00 1.00
windspeed 0.29 0.70 0.00 1.00 1.00
workingday 0.10 0.33 0.61 0.00 1.00
yr 0.19 0.88 0.75 0.96 0.00
To see confidence intervals of the correlations, print with the short=FALSE option
>

It works for me:
rm(list=ls())
# http://archive.ics.uci.edu/ml/machine-learning-databases/00275/
#using the bike.csv
day <- read.csv("Bike-Sharing-Dataset//day.csv")
require(psych)
day<-day[,c("atemp","casual","cnt","holiday","hum","mnth","registered",
"season","temp","weathersit","weekday","windspeed","workingday","yr")]
day[day=='']<-NA
require(psych)
corr.test(day)
# corr.test(day)
# Call:corr.test(x = day)
# Correlation matrix
# atemp casual cnt holiday hum mnth registered season temp weathersit weekday windspeed workingday yr
# atemp 1.00 0.54 0.63 -0.03 0.14 0.23 0.54 0.34 0.99 -0.12 -0.01 -0.18 0.05 0.05
# casual 0.54 1.00 0.67 0.05 -0.08 0.12 0.40 0.21 0.54 -0.25 0.06 -0.17 -0.52 0.25
# cnt 0.63 0.67 1.00 -0.07 -0.10 0.28 0.95 0.41 0.63 -0.30 0.07 -0.23 0.06 0.57
# holiday -0.03 0.05 -0.07 1.00 -0.02 0.02 -0.11 -0.01 -0.03 -0.03 -0.10 0.01 -0.25 0.01
# hum 0.14 -0.08 -0.10 -0.02 1.00 0.22 -0.09 0.21 0.13 0.59 -0.05 -0.25 0.02 -0.11
# mnth 0.23 0.12 0.28 0.02 0.22 1.00 0.29 0.83 0.22 0.04 0.01 -0.21 -0.01 0.00
# registered 0.54 0.40 0.95 -0.11 -0.09 0.29 1.00 0.41 0.54 -0.26 0.06 -0.22 0.30 0.59
# season 0.34 0.21 0.41 -0.01 0.21 0.83 0.41 1.00 0.33 0.02 0.00 -0.23 0.01 0.00
# temp 0.99 0.54 0.63 -0.03 0.13 0.22 0.54 0.33 1.00 -0.12 0.00 -0.16 0.05 0.05
# weathersit -0.12 -0.25 -0.30 -0.03 0.59 0.04 -0.26 0.02 -0.12 1.00 0.03 0.04 0.06 -0.05
# weekday -0.01 0.06 0.07 -0.10 -0.05 0.01 0.06 0.00 0.00 0.03 1.00 0.01 0.04 -0.01
# windspeed -0.18 -0.17 -0.23 0.01 -0.25 -0.21 -0.22 -0.23 -0.16 0.04 0.01 1.00 -0.02 -0.01
# workingday 0.05 -0.52 0.06 -0.25 0.02 -0.01 0.30 0.01 0.05 0.06 0.04 -0.02 1.00 0.00
# yr 0.05 0.25 0.57 0.01 -0.11 0.00 0.59 0.00 0.05 -0.05 -0.01 -0.01 0.00 1.00
# Sample Size
# [1] 731
# Probability values (Entries above the diagonal are adjusted for multiple tests.)
# atemp casual cnt holiday hum mnth registered season temp weathersit weekday windspeed workingday yr
# atemp 0.00 0.00 0.00 1.00 0.01 0.00 0.00 0.00 0.00 0.05 1.00 0.00 1.00 1.00
# casual 0.00 0.00 0.00 1.00 1.00 0.04 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
# cnt 0.00 0.00 0.00 1.00 0.28 0.00 0.00 0.00 0.00 0.00 1.00 0.00 1.00 0.00
# holiday 0.38 0.14 0.06 0.00 1.00 1.00 0.15 1.00 1.00 1.00 0.25 1.00 0.00 1.00
# hum 0.00 0.04 0.01 0.67 0.00 0.00 0.58 0.00 0.03 0.00 1.00 0.00 1.00 0.13
# mnth 0.00 0.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 1.00 1.00
# registered 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
# season 0.00 0.00 0.00 0.78 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 1.00 1.00
# temp 0.00 0.00 0.00 0.44 0.00 0.00 0.00 0.00 0.00 0.05 1.00 0.00 1.00 1.00
# weathersit 0.00 0.00 0.00 0.35 0.00 0.24 0.00 0.60 0.00 0.00 1.00 1.00 1.00 1.00
# weekday 0.84 0.11 0.07 0.01 0.16 0.80 0.12 0.93 1.00 0.40 0.00 1.00 1.00 1.00
# windspeed 0.00 0.00 0.00 0.87 0.00 0.00 0.00 0.00 0.00 0.29 0.70 0.00 1.00 1.00
# workingday 0.16 0.00 0.10 0.00 0.51 0.87 0.00 0.74 0.15 0.10 0.33 0.61 0.00 1.00
# yr 0.21 0.00 0.00 0.83 0.00 0.96 0.00 0.96 0.20 0.19 0.88 0.75 0.96 0.00
#
# To see confidence intervals of the correlations, print with the short=FALSE option
cheers
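As for the general question of finding which line caused the error: calling traceback() right after the failure, or setting options(error = recover) before re-running, will usually point at the offending call; a minimal sketch:
# Sketch: locating the call that raised an error
options(error = recover)  # drop into a browser at the failing frame
cors <- corr.test(data)   # re-run the failing line
# ...or simply, right after the error appears:
traceback()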

Related

Create data frame from EFA output in R

I am working on EFA and would like to customize my tables. There is a function, print.psych(), that suppresses factor loadings below a certain value to make the table easier to read. When I run this function, it prints the loadings and the summary stats in the console (in an .Rmd document, it produces the console text and a separate data frame of the factor loadings with the small loadings suppressed). However, if I attempt to save this as an object, it does not keep this data.
Here is an example:
library(psych)
bfi_data=bfi
bfi_data=bfi_data[complete.cases(bfi_data),]
bfi_cor <- cor(bfi_data)
factors_data <- fa(r = bfi_cor, nfactors = 6)
print.psych(factors_data, cut = .32, sort = TRUE)
In an R script, it produces this:
item MR2 MR3 MR1 MR5 MR4 MR6 h2 u2 com
N2 17 0.83 0.654 0.35 1.0
N1 16 0.82 0.666 0.33 1.1
N3 18 0.69 0.549 0.45 1.1
N5 20 0.47 0.376 0.62 2.2
N4 19 0.44 0.43 0.506 0.49 2.4
C4 9 -0.67 0.555 0.45 1.3
C2 7 0.66 0.475 0.53 1.4
C5 10 -0.56 0.433 0.57 1.4
C3 8 0.56 0.317 0.68 1.1
C1 6 0.54 0.344 0.66 1.3
In R Markdown, it produces this:
How can I save that data.frame as an object?
Looking at the str() of the object, it doesn't look like what you want is built in. An ugly way would be to use capture.output() and try to convert the character vector to a data frame using string manipulation. But since the data is being displayed, it must be present somewhere in the object itself. I could find vectors of the same length that can be combined to form the data frame.
loadings <- unclass(factors_data$loadings)
h2 <- factors_data$communalities
# There is also factors_data$communality, which has the same values
u2 <- factors_data$uniquenesses
com <- factors_data$complexity
data <- cbind(loadings, h2, u2, com)
data
This returns:
# MR2 MR3 MR1 MR5 MR4 MR6 h2 u2 com
#A1 0.11 0.07 -0.07 -0.56 -0.01 0.35 0.38 0.62 1.85
#A2 0.03 0.09 -0.08 0.64 0.01 -0.06 0.47 0.53 1.09
#A3 -0.04 0.04 -0.10 0.60 0.07 0.16 0.51 0.49 1.26
#A4 -0.07 0.19 -0.07 0.41 -0.13 0.13 0.29 0.71 2.05
#A5 -0.17 0.01 -0.16 0.47 0.10 0.22 0.47 0.53 2.11
#C1 0.05 0.54 0.08 -0.02 0.19 0.05 0.34 0.66 1.32
#C2 0.09 0.66 0.17 0.06 0.08 0.16 0.47 0.53 1.36
#C3 0.00 0.56 0.07 0.07 -0.04 0.05 0.32 0.68 1.09
#C4 0.07 -0.67 0.10 -0.01 0.02 0.25 0.55 0.45 1.35
#C5 0.15 -0.56 0.17 0.02 0.10 0.01 0.43 0.57 1.41
#E1 -0.14 0.09 0.61 -0.14 -0.08 0.09 0.41 0.59 1.34
#E2 0.06 -0.03 0.68 -0.07 -0.08 -0.01 0.56 0.44 1.07
#E3 0.02 0.01 -0.32 0.17 0.38 0.28 0.51 0.49 3.28
#E4 -0.07 0.03 -0.49 0.25 0.00 0.31 0.56 0.44 2.26
#E5 0.16 0.27 -0.39 0.07 0.24 0.04 0.41 0.59 3.01
#N1 0.82 -0.01 -0.09 -0.09 -0.03 0.02 0.67 0.33 1.05
#N2 0.83 0.02 -0.07 -0.07 0.01 -0.07 0.65 0.35 1.04
#N3 0.69 -0.03 0.13 0.09 0.02 0.06 0.55 0.45 1.12
#N4 0.44 -0.14 0.43 0.09 0.10 0.01 0.51 0.49 2.41
#N5 0.47 -0.01 0.21 0.21 -0.17 0.09 0.38 0.62 2.23
#O1 -0.05 0.07 -0.01 -0.04 0.57 0.09 0.36 0.64 1.11
#O2 0.12 -0.09 0.01 0.12 -0.43 0.28 0.30 0.70 2.20
#O3 0.01 0.00 -0.10 0.05 0.65 0.04 0.48 0.52 1.06
#O4 0.10 -0.05 0.34 0.15 0.37 -0.04 0.24 0.76 2.55
#O5 0.04 -0.04 -0.02 -0.01 -0.50 0.30 0.33 0.67 1.67
#gender 0.20 0.09 -0.12 0.33 -0.21 -0.15 0.18 0.82 3.58
#education -0.03 0.01 0.05 0.11 0.12 -0.22 0.07 0.93 2.17
#age -0.06 0.07 -0.02 0.16 0.03 -0.26 0.10 0.90 2.05
Ronak Shaw answered my question above, and I used his answer to help create the following function, which nearly reproduces the print.psych() data frame of fa.sort() output:
library(dplyr)   # for rename, left_join, mutate, across
library(tibble)  # for rownames_to_column

fa_table <- function(x, cut) {
  # get sorted loadings
  loadings <- fa.sort(x)$loadings %>% round(3)
  # blank out loadings below the cutoff (in absolute value)
  loadings[abs(loadings) < cut] <- ""
  # get additional info
  add_info <- cbind(x$communalities,
                    x$uniquenesses,
                    x$complexity) %>%
    as.data.frame() %>%
    rename("communality" = V1,
           "uniqueness" = V2,
           "complexity" = V3) %>%
    rownames_to_column("item")
  # build table
  loadings %>%
    unclass() %>%
    as.data.frame() %>%
    rownames_to_column("item") %>%
    left_join(add_info, by = "item") %>%
    mutate(across(where(is.numeric), ~ round(.x, 3)))
}
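With the objects defined earlier in the question, the call would look like this (a minimal usage sketch; it assumes dplyr and tibble are installed):
# Usage sketch, reusing factors_data from the question above
fa_table(factors_data, cut = .32)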

Minimizing weighted sum of matrix while ensuring distribution of outcomes in R

I am struggling with an optimization problem involving a simple matrix operation. The task is the following: I have a square matrix D containing "damage multipliers" stemming from a production reduction in producing countries (columns) and felt by "receiving" countries (rows).
AUT BEL BGR CYP CZE DEU DNK ESP EST FIN FRA GBR GRC HRV HUN IRL ITA LTU LUX LVA MLT NLD POL PRT ROU SVK SVN SWE
AUT 1.48 0.15 0.18 0.08 0.19 0.22 0.01 0.01 0.02 0.02 0.05 0.01 0.01 0.02 0.14 0.00 0.02 0.03 0.02 0.02 0.00 0.04 0.10 0.09 0.11 0.16 0.17 0.11
BEL 0.03 2.70 0.34 0.09 0.05 0.03 0.02 0.01 0.04 0.09 0.09 0.02 0.01 0.01 0.03 0.01 0.01 0.03 0.08 0.02 0.00 0.04 0.03 0.37 0.09 0.07 0.15 0.29
BGR 0.01 0.02 9.81 0.09 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.06 0.00 0.01 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.02 0.12 0.01 0.00 0.01
CYP 0.00 0.01 0.01 9.87 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
CZE 0.19 0.11 0.08 0.07 4.14 0.27 0.01 0.00 0.01 0.01 0.03 0.01 0.00 0.00 0.05 0.00 0.03 0.05 0.01 0.01 0.00 0.02 0.32 0.07 0.03 2.57 0.05 0.05
DEU 0.29 2.54 0.27 0.15 0.19 1.71 0.10 0.04 0.06 0.22 0.22 0.09 0.03 0.02 0.11 0.03 0.08 0.12 0.08 0.07 0.00 0.28 0.28 0.55 0.25 0.26 0.11 1.09
DNK 0.01 0.09 0.02 0.09 0.01 0.14 3.43 0.00 0.02 0.12 0.02 0.02 0.00 0.00 0.01 0.00 0.01 0.02 0.01 0.02 0.00 0.01 0.03 0.05 0.01 0.01 0.01 1.39
ESP 0.02 0.26 0.06 0.05 0.02 0.03 0.02 2.72 0.45 0.04 0.22 0.05 0.04 0.01 0.01 0.05 0.06 0.02 0.01 0.01 0.00 0.02 0.03 1.28 0.05 0.02 0.01 0.32
EST 0.00 0.01 0.00 0.03 0.00 0.00 0.00 0.00 5.03 0.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.00 0.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05
FIN 0.01 0.09 0.02 0.03 0.01 0.01 0.06 0.00 0.21 5.48 0.01 0.01 0.00 0.00 0.00 0.01 0.00 0.02 0.01 0.02 0.00 0.01 0.02 0.05 0.01 0.01 0.00 1.99
FRA 0.04 0.89 0.11 0.13 0.03 0.08 0.03 0.18 0.04 0.08 5.19 0.05 0.02 0.01 0.03 0.05 0.06 0.06 0.03 0.03 0.00 0.14 0.04 0.54 0.08 0.04 0.03 0.79
GBR 0.03 0.80 0.09 2.13 0.03 0.05 0.12 0.08 0.03 0.30 0.15 3.13 0.02 0.01 0.02 0.41 0.02 0.12 0.02 0.05 0.00 0.19 0.06 0.36 0.05 0.04 0.02 2.28
GRC 0.00 0.04 0.14 0.26 0.00 0.00 0.00 0.01 0.00 0.00 0.01 0.00 2.10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.03 0.03 0.00 0.00 0.02
HRV 0.19 0.01 0.01 0.03 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.25 0.03 0.00 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.00 0.09 0.01
HUN 0.29 0.07 0.08 0.17 0.30 0.08 0.02 0.00 0.01 0.01 0.06 0.00 0.00 0.01 4.83 0.00 0.01 0.09 0.01 0.05 0.00 0.01 0.05 0.04 0.13 0.23 0.06 0.04
IRL 0.00 0.03 0.01 0.06 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.03 0.00 0.00 0.00 1.80 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.02 0.00 0.00 0.00 0.03
ITA 0.76 0.46 0.40 0.20 0.06 0.24 0.02 0.18 0.04 0.05 0.19 0.03 0.14 0.06 0.06 0.06 4.16 0.05 0.02 0.07 0.00 0.14 0.05 0.37 0.15 0.08 0.21 0.34
LTU 0.00 0.02 0.01 0.01 0.00 0.00 0.00 0.00 0.02 0.03 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.18 0.00 0.03 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.02
LUX 0.00 0.14 0.00 0.02 0.00 0.02 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.04 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.01 0.01
LVA 0.00 0.01 0.00 0.03 0.00 0.00 0.00 0.00 0.05 0.15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.16 0.00 6.77 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.03
MLT 0.00 0.00 0.00 0.09 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.67 0.00 0.00 0.00 0.00 0.00 0.00 0.01
NLD 0.02 0.86 0.07 0.08 0.02 0.04 0.03 0.01 0.03 0.11 0.08 0.03 0.01 0.01 0.02 0.05 0.01 0.07 0.03 0.02 0.00 2.03 0.03 0.23 0.04 0.03 0.02 0.43
POL 0.02 0.09 0.03 0.19 0.16 0.13 0.01 0.01 0.01 0.02 0.06 0.01 0.00 0.00 0.02 0.00 0.01 0.33 0.00 0.03 0.00 0.01 2.18 0.05 0.02 0.11 0.01 0.11
PRT 0.00 0.05 0.01 0.10 0.00 0.03 0.01 0.07 0.02 0.01 0.04 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.53 0.01 0.00 0.00 0.07
ROU 0.04 0.06 0.89 0.13 0.04 0.02 0.00 0.00 0.00 0.01 0.01 0.00 0.01 0.00 0.31 0.00 0.02 0.02 0.00 0.01 0.00 0.00 0.03 0.04 10.52 0.06 0.01 0.03
SVK 0.23 0.04 0.02 0.08 1.12 0.60 0.00 0.00 0.00 0.01 0.32 0.00 0.00 0.00 0.11 0.00 0.00 0.07 0.00 0.02 0.00 0.00 0.34 0.03 0.03 7.06 0.02 0.03
SVN 0.13 0.01 0.02 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.02 0.01 0.00 0.05 0.01 0.00 0.00 0.00 0.00 0.00 0.01 0.01 0.01 6.77 0.01
SWE 0.02 0.20 0.05 0.08 0.02 0.03 0.26 0.01 0.12 0.90 0.04 0.03 0.00 0.01 0.01 0.01 0.01 0.03 0.01 0.06 0.00 0.02 0.05 0.12 0.03 0.02 0.02 8.05
The values represent the effect of a unitary shock in production: i.e., if country AUT reduces production by one unit, the damage felt in country DEU is 0.29. Hence, the matrix can be seen as a network of production effects between countries.
My goal is to find the optimal weights of a weighted unitary shock (i.e. weighting the columns so that the total reduction in production, summed over all countries, equals 1) that:
ensure a certain distribution of damage across receiving (row) countries (i.e. the row sums), let's say an equal distribution,
while at the same time minimizing the damage to the overall economic system.
I've tried solving it as a simple non-linear optimization problem with equality constraints, using the package Rsolnp:
# objective function to be minimized (global damage)
damage <- function(weights) {
D_weighted <- t(t(D)*weights); return(sum(D_weighted))
}
# constraints (combined in one function):
constr <- function(weights) {
# constraint 1: sum of weights needs to be 1
c1 = sum(weights)
# constraint 2: equal distribution in damage outcome
D_weighted <- t(t(D)*weights)
damage_per_country <- rowSums(D_weighted)/sum(D_weighted)
c2 = damage_per_country/sum(D_weighted)
return(c(c1, c2))
}
# target distribution of damage outcome (for example: equal distribution)
targ_dist <- c(rep(1/(ncol(D)), ncol(D)))
# starting weights (start with the same production reduction in every country)
startweights <- rep(1/ncol(D), ncol(D))
# run optimization with Rsolnp
opt_weights <- solnp(pars = startweights, fun = damage, eqfun = constr, eqB = c(1, targ_dist), LB = rep(0, ncol(D)), UB = rep(1, ncol(D)), control=list(outer.iter=1000,trace=0, tol= 0.001))
but it doesn't converge and returns a warning message:
"The linearized problem has no feasible solution. The problem may not be feasible".
Changing the tolerance doesn't solve the problem. It might be that this solver is not suited for this kind of problem or I need to reformulate the problem completely. I'd be thankful for any help!
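One observation on the constraint function above (an observation, not a guaranteed fix): damage_per_country is already each country's share of total damage, and c2 then divides by sum(D_weighted) a second time, so c2 sums to 1/sum(D_weighted) rather than 1 and can never equal targ_dist. A sketch of the constraint function without the double normalization:
# Sketch: c1 = total production cut, c2 = each country's share of total damage
constr <- function(weights) {
  D_weighted <- t(t(D) * weights)
  c1 <- sum(weights)
  c2 <- rowSums(D_weighted) / sum(D_weighted)
  c(c1, c2)
}
# eqB stays c(1, targ_dist), as in the solnp() call above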

How to extract value of CPU idle from sar command using AWK

From the output of a sar command, I want to extract only the lines in which the %iowait value is higher than a set threshold.
I tried using AWK but somehow I'm not able to perform the action.
sar -u -f sa12 | sed 's/\./,/g' | awk -f" " '{ if ( $7 -gt 0 ) print $0 }'
I tried substituting the . with , and using -gt, but still no joy.
Can someone suggest a solution?
If we need the entire line of sar -u output where %iowait > 0.01, we can use this:
Command
sar -u | grep -v "CPU" | awk '$7 > 0.01'
Output will be similar to
03:40:01 AM all 3.16 0.00 0.05 0.11 0.00 96.68
04:40:01 PM all 0.19 0.00 0.05 0.02 0.00 99.74
If we wish to output specific fields, say only %iowait, we can use the command given below.
Command to output specific field(s):
sar -u | grep -v "CPU" | awk '{if($7 > 0.01 ) print $7}'
Output will be
0.11
0.02
Note: grep -v is used just to remove the headings from the output.
Hope this helps,
My sar -u gives several lines similar to the following:
Linux 4.4.0-127-generic (v1) 06/12/2018 _x86_64_ (1 CPU)
12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:05:01 AM all 0.29 0.00 0.30 0.01 0.00 99.40
12:15:01 AM all 0.33 0.00 0.34 0.00 0.00 99.32
12:25:01 AM all 0.33 0.00 0.30 0.01 0.00 99.36
12:35:01 AM all 0.31 0.00 0.29 0.01 0.00 99.39
12:45:01 AM all 0.33 0.00 0.32 0.01 0.00 99.35
12:55:01 AM all 0.32 0.00 0.30 0.00 0.00 99.38
01:05:01 AM all 0.32 0.00 0.28 0.00 0.00 99.39
01:15:01 AM all 0.33 0.00 0.30 0.01 0.00 99.37
01:25:01 AM all 0.31 0.00 0.30 0.01 0.00 99.39
01:35:01 AM all 0.31 0.00 0.33 0.00 0.00 99.36
01:45:01 AM all 0.31 0.00 0.28 0.01 0.00 99.40
01:55:01 AM all 0.31 0.00 0.30 0.00 0.00 99.38
02:05:01 AM all 0.31 0.00 0.28 0.01 0.00 99.40
02:15:01 AM all 0.32 0.00 0.30 0.01 0.00 99.38
02:25:01 AM all 0.31 0.00 0.30 0.01 0.00 99.38
02:35:01 AM all 0.33 0.00 0.33 0.00 0.00 99.33
02:45:01 AM all 0.35 0.00 0.32 0.01 0.00 99.32
02:55:01 AM all 0.28 0.00 0.30 0.00 0.00 99.42
03:05:01 AM all 0.32 0.00 0.31 0.00 0.00 99.37
03:15:01 AM all 0.34 0.00 0.30 0.01 0.00 99.36
03:25:01 AM all 0.32 0.00 0.29 0.01 0.00 99.38
03:35:01 AM all 0.33 0.00 0.26 0.00 0.00 99.40
03:45:01 AM all 0.34 0.00 0.29 0.00 0.00 99.36
03:55:01 AM all 0.30 0.00 0.28 0.01 0.00 99.41
04:05:01 AM all 0.32 0.00 0.30 0.01 0.00 99.37
04:15:01 AM all 0.37 0.00 0.31 0.01 0.00 99.32
04:25:01 AM all 1.78 2.04 0.59 0.05 0.00 95.55
To filter out those where %iowait is greater than, let's say, 0.01:
sar -u | awk '$7>0.01{print}'
Linux 4.4.0-127-generic (v1) 06/12/2018 _x86_64_ (1 CPU)
04:25:01 AM all 1.78 2.04 0.59 0.05 0.00 95.55
05:15:01 AM all 0.34 0.00 0.32 0.02 0.00 99.32
06:35:01 AM all 0.33 0.22 1.23 4.48 0.00 93.74
06:45:01 AM all 0.16 0.00 0.12 0.02 0.00 99.71
10:35:01 AM all 0.22 0.00 0.13 0.02 0.00 99.63
12:15:01 PM all 0.42 0.00 0.16 0.03 0.00 99.40
01:45:01 PM all 0.17 0.00 0.11 0.02 0.00 99.71
04:05:01 PM all 0.15 0.00 0.12 0.03 0.00 99.70
04:15:01 PM all 0.42 0.00 0.23 0.10 0.00 99.25
Edit:
As correctly pointed out by @Ed Morton, the awk code can be shortened to simply awk '$7>0.01', since the default action is to print the current line.

R profiling spending a lot of time using .External2

I am learning how to use R profiling and have run the Rprof command on my code.
The summaryRprof function shows that a lot of time is spent in .External2. What is this? Additionally, a large proportion of the total time is spent on <Anonymous>; is there a way to find out what this is?
> summaryRprof("test")
$by.self
self.time self.pct total.time total.pct
".External2" 4.30 27.74 4.30 27.74
"format.POSIXlt" 2.70 17.42 2.90 18.71
"which.min" 2.38 15.35 4.12 26.58
"-" 1.30 8.39 1.30 8.39
"order" 1.16 7.48 1.16 7.48
"match" 0.58 3.74 0.58 3.74
"file" 0.44 2.84 0.44 2.84
"abs" 0.40 2.58 0.40 2.58
"scan" 0.30 1.94 0.30 1.94
"anyDuplicated.default" 0.20 1.29 0.20 1.29
"unique.default" 0.20 1.29 0.20 1.29
"unlist" 0.18 1.16 0.20 1.29
"c" 0.16 1.03 0.16 1.03
"data.frame" 0.14 0.90 0.22 1.42
"structure" 0.12 0.77 1.74 11.23
"as.POSIXct.POSIXlt" 0.12 0.77 0.12 0.77
"strptime" 0.12 0.77 0.12 0.77
"as.character" 0.08 0.52 0.90 5.81
"make.unique" 0.08 0.52 0.16 1.03
"[.data.frame" 0.06 0.39 1.54 9.94
"<Anonymous>" 0.04 0.26 4.34 28.00
"lapply" 0.04 0.26 1.70 10.97
"rbind" 0.04 0.26 0.94 6.06
"as.POSIXlt.POSIXct" 0.04 0.26 0.04 0.26
"ifelse" 0.04 0.26 0.04 0.26
"paste" 0.02 0.13 0.92 5.94
"merge.data.frame" 0.02 0.13 0.56 3.61
"[<-.factor" 0.02 0.13 0.52 3.35
"stopifnot" 0.02 0.13 0.04 0.26
".deparseOpts" 0.02 0.13 0.02 0.13
".External" 0.02 0.13 0.02 0.13
"close.connection" 0.02 0.13 0.02 0.13
"doTryCatch" 0.02 0.13 0.02 0.13
"is.na" 0.02 0.13 0.02 0.13
"is.na<-.default" 0.02 0.13 0.02 0.13
"mean" 0.02 0.13 0.02 0.13
"seq.int" 0.02 0.13 0.02 0.13
"sum" 0.02 0.13 0.02 0.13
"sys.function" 0.02 0.13 0.02 0.13
$by.total
total.time total.pct self.time self.pct
"write.table" 5.10 32.90 0.00 0.00
"<Anonymous>" 4.34 28.00 0.04 0.26
".External2" 4.30 27.74 4.30 27.74
"mapply" 4.22 27.23 0.00 0.00
"head" 4.16 26.84 0.00 0.00
"which.min" 4.12 26.58 2.38 15.35
"eval" 3.16 20.39 0.00 0.00
"eval.parent" 3.14 20.26 0.00 0.00
"write.csv" 3.14 20.26 0.00 0.00
"format" 2.92 18.84 0.00 0.00
"format.POSIXlt" 2.90 18.71 2.70 17.42
"do.call" 1.78 11.48 0.00 0.00
"structure" 1.74 11.23 0.12 0.77
"lapply" 1.70 10.97 0.04 0.26
"FUN" 1.66 10.71 0.00 0.00
"format.POSIXct" 1.62 10.45 0.00 0.00
"[.data.frame" 1.54 9.94 0.06 0.39
"[" 1.54 9.94 0.00 0.00
"-" 1.30 8.39 1.30 8.39
"order" 1.16 7.48 1.16 7.48
"rbind" 0.94 6.06 0.04 0.26
"paste" 0.92 5.94 0.02 0.13
"as.character" 0.90 5.81 0.08 0.52
"read.csv" 0.84 5.42 0.00 0.00
"read.table" 0.84 5.42 0.00 0.00
"as.character.POSIXt" 0.82 5.29 0.00 0.00
"match" 0.58 3.74 0.58 3.74
"merge.data.frame" 0.56 3.61 0.02 0.13
"merge" 0.56 3.61 0.00 0.00
"[<-.factor" 0.52 3.35 0.02 0.13
"[<-" 0.52 3.35 0.00 0.00
"strftime" 0.48 3.10 0.00 0.00
"file" 0.44 2.84 0.44 2.84
"weekdays" 0.42 2.71 0.00 0.00
"weekdays.POSIXt" 0.42 2.71 0.00 0.00
"abs" 0.40 2.58 0.40 2.58
"unique" 0.38 2.45 0.00 0.00
"scan" 0.30 1.94 0.30 1.94
"data.frame" 0.22 1.42 0.14 0.90
"cbind" 0.22 1.42 0.00 0.00
"anyDuplicated.default" 0.20 1.29 0.20 1.29
"unique.default" 0.20 1.29 0.20 1.29
"unlist" 0.20 1.29 0.18 1.16
"anyDuplicated" 0.20 1.29 0.00 0.00
"as.POSIXct" 0.18 1.16 0.00 0.00
"as.POSIXlt" 0.18 1.16 0.00 0.00
"c" 0.16 1.03 0.16 1.03
"make.unique" 0.16 1.03 0.08 0.52
"as.POSIXct.POSIXlt" 0.12 0.77 0.12 0.77
"strptime" 0.12 0.77 0.12 0.77
"as.POSIXlt.character" 0.12 0.77 0.00 0.00
"object.size" 0.12 0.77 0.00 0.00
"as.POSIXct.default" 0.10 0.65 0.00 0.00
"Ops.POSIXt" 0.08 0.52 0.00 0.00
"type.convert" 0.08 0.52 0.00 0.00
"!=" 0.06 0.39 0.00 0.00
"as.POSIXlt.factor" 0.06 0.39 0.00 0.00
"as.POSIXlt.POSIXct" 0.04 0.26 0.04 0.26
"ifelse" 0.04 0.26 0.04 0.26
"stopifnot" 0.04 0.26 0.02 0.13
"$" 0.04 0.26 0.00 0.00
"$.data.frame" 0.04 0.26 0.00 0.00
"[[" 0.04 0.26 0.00 0.00
"[[.data.frame" 0.04 0.26 0.00 0.00
"head.default" 0.04 0.26 0.00 0.00
".deparseOpts" 0.02 0.13 0.02 0.13
".External" 0.02 0.13 0.02 0.13
"close.connection" 0.02 0.13 0.02 0.13
"doTryCatch" 0.02 0.13 0.02 0.13
"is.na" 0.02 0.13 0.02 0.13
"is.na<-.default" 0.02 0.13 0.02 0.13
"mean" 0.02 0.13 0.02 0.13
"seq.int" 0.02 0.13 0.02 0.13
"sum" 0.02 0.13 0.02 0.13
"sys.function" 0.02 0.13 0.02 0.13
"%in%" 0.02 0.13 0.00 0.00
".rs.getSingleClass" 0.02 0.13 0.00 0.00
"[.POSIXlt" 0.02 0.13 0.00 0.00
"==" 0.02 0.13 0.00 0.00
"close" 0.02 0.13 0.00 0.00
"data.row.names" 0.02 0.13 0.00 0.00
"deparse" 0.02 0.13 0.00 0.00
"factor" 0.02 0.13 0.00 0.00
"is.na<-" 0.02 0.13 0.00 0.00
"match.arg" 0.02 0.13 0.00 0.00
"match.call" 0.02 0.13 0.00 0.00
"pushBack" 0.02 0.13 0.00 0.00
"seq" 0.02 0.13 0.00 0.00
"seq.POSIXt" 0.02 0.13 0.00 0.00
"simplify2array" 0.02 0.13 0.00 0.00
"tryCatch" 0.02 0.13 0.00 0.00
"tryCatchList" 0.02 0.13 0.00 0.00
"tryCatchOne" 0.02 0.13 0.00 0.00
"which" 0.02 0.13 0.00 0.00
$sample.interval
[1] 0.02
$sampling.time
[1] 15.5
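.External2 is one of R's interfaces for calling compiled C code (write.table, for instance, does its work through it), so that time is spent in C code invoked by some R function higher up the call stack. To attribute buckets like <Anonymous> and .External2 back to specific source lines, line profiling can help; a minimal sketch, assuming the code lives in a script file (my_script.R is a placeholder name):
# Sketch: line-level profiling so summaryRprof() can report file:line locations
Rprof("profile.out", line.profiling = TRUE)
source("my_script.R", keep.source = TRUE)  # placeholder script; keep.source retains line info
Rprof(NULL)
summaryRprof("profile.out", lines = "show")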

Principal component analysis (PCA) in R: why are the scores not orthogonal? (using Psych package)

I ran PCA in R using the principal() function in the psych package. I set the argument rotate = "none", so the components should be unrotated and therefore orthogonal. From what I understand, the scores of PC1 and PC2 should be orthogonal (i.e. there should be zero correlation between (raw data)(loading of PC1) and (raw data)(loading of PC2)). However, I got 90% correlation. Why is that?
> #load the package
> library(psych)
> #calculate the correlation matrix
> corMat <- cor(data)
> #run PCA
> pca.results <- principal(r = corMat, rotate = "none", nfactors = 20, covar = FALSE, scores = TRUE)
> pca.results
Principal Components Analysis
Call: principal(r = corMat, nfactors = 20, rotate = "none", covar = FALSE,
scores = TRUE)
Standardized loadings (pattern matrix) based upon correlation matrix
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9
payroll.chg -0.30 0.85 0.21 0.35 -0.03 0.02 0.07 -0.11 -0.02
HH.empl.chg -0.26 0.62 0.64 -0.35 0.01 -0.06 0.06 0.00 0.01
pop.empl.ratio -0.92 -0.34 0.13 0.04 0.06 -0.03 -0.04 0.03 -0.04
u.rate 0.99 0.10 0.02 0.04 0.01 0.04 0.04 0.04 0.01
median.duration.unempl 0.88 0.44 -0.02 0.02 -0.04 0.06 0.02 0.13 -0.05
LT.unempl.unempl.ratio 0.86 0.49 -0.04 0.01 -0.07 0.02 0.00 0.08 -0.02
U4 0.99 0.13 0.01 0.03 0.01 0.04 0.04 0.05 0.01
U6 0.98 0.13 -0.05 -0.02 0.00 0.06 0.04 0.03 0.04
vacancy.rate -0.87 0.35 -0.18 -0.11 -0.01 0.22 0.10 0.03 -0.01
hires.rate -0.92 0.08 0.24 0.21 -0.16 0.06 0.00 0.05 0.09
unemployed.to.employed 0.89 0.17 0.21 -0.02 0.05 0.24 -0.25 -0.05 0.00
Layoff.rate..JOLT. 0.23 -0.86 0.19 -0.03 -0.40 0.09 0.03 -0.02 -0.05
Exhaustion.rate 0.95 0.19 0.14 0.14 0.00 -0.07 0.01 0.06 -0.04
Quits.rate..JOLT. -0.98 0.01 0.04 0.04 0.01 0.02 -0.06 0.10 0.13
participation.rate -0.67 -0.61 0.31 0.14 0.16 -0.01 -0.03 0.11 -0.08
insured.u.rate 0.88 -0.40 0.17 0.08 0.12 0.05 0.09 -0.03 0.02
Initial.jobless.claims 0.78 -0.60 0.04 -0.06 0.06 0.05 0.07 0.02 0.07
Continuing.claims 0.86 -0.44 0.15 0.06 0.14 0.08 0.09 -0.05 0.03
Jobs.plentiful.jobs.hardtoget -0.98 0.00 -0.02 0.01 0.08 0.13 0.04 -0.02 -0.04
vacancy.unempl.ratio -0.97 0.04 -0.05 -0.03 0.08 0.18 0.07 0.03 -0.03
PC10 PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18
payroll.chg -0.06 0.02 -0.02 0.00 0.03 0.00 0.00 0.00 0.00
HH.empl.chg 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
pop.empl.ratio -0.02 0.00 -0.01 0.01 0.00 0.00 0.00 0.01 0.01
u.rate -0.01 0.00 0.03 -0.03 0.02 0.00 0.00 -0.01 -0.01
median.duration.unempl 0.02 0.05 -0.06 -0.01 -0.03 0.01 -0.02 0.00 0.00
LT.unempl.unempl.ratio 0.01 0.02 -0.01 0.02 0.00 0.00 0.05 0.00 0.00
U4 -0.01 0.00 0.04 -0.02 0.02 0.00 -0.01 -0.01 0.01
U6 -0.01 0.01 0.03 -0.03 0.02 -0.02 0.00 0.03 0.00
vacancy.rate -0.08 -0.06 0.01 0.01 -0.01 0.04 0.00 0.00 0.00
hires.rate 0.01 0.00 0.04 0.00 -0.06 -0.01 0.00 0.00 0.00
unemployed.to.employed -0.01 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00
Layoff.rate..JOLT. 0.01 0.00 -0.01 -0.01 0.03 0.00 0.00 0.00 0.00
Exhaustion.rate 0.05 -0.07 0.02 0.06 0.01 -0.01 -0.02 0.00 0.00
Quits.rate..JOLT. 0.04 -0.01 -0.04 0.00 0.05 0.02 0.00 0.00 0.00
participation.rate -0.06 0.00 0.02 -0.02 0.01 0.01 0.01 0.00 0.00
insured.u.rate 0.04 -0.02 -0.02 0.00 -0.02 0.02 0.01 0.00 0.02
Initial.jobless.claims -0.09 0.06 0.00 0.06 0.01 -0.01 -0.01 0.00 0.00
Continuing.claims 0.05 -0.02 -0.02 -0.02 -0.01 0.01 0.01 0.01 -0.02
Jobs.plentiful.jobs.hardtoget 0.11 0.07 0.05 0.02 0.01 0.02 0.00 0.00 0.00
vacancy.unempl.ratio 0.03 -0.01 -0.03 0.00 0.01 -0.06 0.00 0.00 0.00
PC19 PC20 h2 u2
payroll.chg 0.00 0.00 1 5.6e-16
HH.empl.chg 0.00 0.00 1 -2.9e-15
pop.empl.ratio 0.01 0.01 1 -1.6e-15
u.rate -0.01 0.01 1 1.1e-16
median.duration.unempl 0.00 0.00 1 -4.4e-16
LT.unempl.unempl.ratio 0.00 0.00 1 -6.7e-16
U4 0.01 0.00 1 -4.4e-16
U6 0.00 0.00 1 2.2e-16
vacancy.rate 0.00 0.00 1 0.0e+00
hires.rate 0.00 0.00 1 4.4e-16
unemployed.to.employed 0.00 0.00 1 -2.2e-16
Layoff.rate..JOLT. 0.00 0.00 1 -2.2e-15
Exhaustion.rate 0.00 0.00 1 -4.4e-16
Quits.rate..JOLT. 0.00 0.00 1 1.1e-16
participation.rate 0.00 -0.01 1 5.6e-16
insured.u.rate -0.01 0.00 1 -6.7e-16
Initial.jobless.claims 0.00 0.00 1 -2.0e-15
Continuing.claims 0.01 0.00 1 -6.7e-16
Jobs.plentiful.jobs.hardtoget 0.00 0.00 1 2.2e-16
vacancy.unempl.ratio 0.00 0.00 1 -2.2e-16
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12
SS loadings 14.23 3.73 0.83 0.37 0.28 0.20 0.12 0.07 0.05 0.05 0.02 0.02
Proportion Var 0.71 0.19 0.04 0.02 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.00
Cumulative Var 0.71 0.90 0.94 0.96 0.97 0.98 0.99 0.99 0.99 1.00 1.00 1.00
Proportion Explained 0.71 0.19 0.04 0.02 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.00
Cumulative Proportion 0.71 0.90 0.94 0.96 0.97 0.98 0.99 0.99 0.99 1.00 1.00 1.00
PC13 PC14 PC15 PC16 PC17 PC18 PC19 PC20
SS loadings 0.01 0.01 0.01 0 0 0 0 0
Proportion Var 0.00 0.00 0.00 0 0 0 0 0
Cumulative Var 1.00 1.00 1.00 1 1 1 1 1
Proportion Explained 0.00 0.00 0.00 0 0 0 0 0
Cumulative Proportion 1.00 1.00 1.00 1 1 1 1 1
Test of the hypothesis that 20 components are sufficient.
The degrees of freedom for the null model are 190 and the objective function was 68.46
The degrees of freedom for the model are -20 and the objective function was 0
Fit based upon off diagonal values = 1
To find the component scores you can skip the step in which you find the correlations; principal() will do that for you. Then you can skip the step Hong Ooi suggested and just find the scores directly. They should be orthogonal.
Using your example:
pca.results <- principal(data,nfactors=20,rotate='none')
#then correlate the scores
cor(pca.results$scores) #these should be orthogonal
Bill
What you've got there are not the PCA scores, but the PCA loadings. To get the former, use the predict method on your model. You should find that the predicted scores are indeed uncorrelated with each other.
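A minimal sketch of that approach (assuming data is the numeric data frame the correlation matrix was computed from):
# Sketch: get scores from the fitted model, then check orthogonality
scores <- predict(pca.results, data = data)
round(cor(scores), 2)  # should be close to the identity matrix for rotate = "none"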
