I'm trying to substitute some characters by some strings, but when I try this happens:
Group <- "ABC"
A <- "0.25 0.65 0.48"
B <- "0.054 0.41 0.09"
C <- "0.8 0.047 0.34"
Group <- gsub("A", A, Group)
Group <- gsub("B", B, Group)
Group <- gsub("C", C, Group)
Group
When I group them there is no space between A, B and C. The above code results in:
0.25 0.65 0.480.054 0.41 0.090.8 0.047 0.34
I want the output to look like this:
0.25 0.65 0.48 0.054 0.41 0.09 0.8 0.047 0.34
I would appreciate any help with this.
There are several syntactical errors, but let me show you what I think you are trying to accomplish:
Group <- 'ABC'
A <- paste(0.25, 0.65, 0.48)
Group = gsub('A', A, Group)
[1] "0.25 0.65 0.48BC"
EDIT: Seeing your reformatted question, I would say the only change is to put a space between your Group letters:
Group <- 'A B C'
Or paste an empty string onto the end of each group of numbers, which adds a trailing space:
A <- paste(0.25, 0.65, 0.48, "")
You can transform Group a bit, i.e., trimws(gsub("", " ", Group)), which inserts a space between the characters in Group.
Just use paste; its default separator (sep = " ") supplies the spaces, so collapse is not needed here:
A <- "0.25 0.65 0.48"
B <- "0.054 0.41 0.09"
C <- "0.8 0.047 0.34"
paste(A, B, C)
"0.25 0.65 0.48 0.054 0.41 0.09 0.8 0.047 0.34"
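If the letters in Group really have to drive the substitution, one possible approach is to split Group into single characters and look each one up in a named vector. This is only a sketch: the names lookup and keys are made up for illustration and are not part of the question.

```r
Group <- "ABC"

# Hypothetical lookup table: one entry per letter that may appear in Group
lookup <- c(A = "0.25 0.65 0.48",
            B = "0.054 0.41 0.09",
            C = "0.8 0.047 0.34")

# Split "ABC" into c("A", "B", "C")
keys <- strsplit(Group, "")[[1]]

# Look up each letter and join the pieces with a single space
result <- paste(lookup[keys], collapse = " ")
result
# [1] "0.25 0.65 0.48 0.054 0.41 0.09 0.8 0.047 0.34"
```

Here collapse does matter, because lookup[keys] is a vector of three strings that must be joined into one.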
I am trying to make a ROC curve using pROC with the 2 columns below (the list goes on for over 300 entries):
Actual_Findings_%  Predicted_Finding_Prob
0.23               0.6
0.48               0.3
0.26               0.62
0.23               0.6
0.48               0.3
0.47               0.3
0.23               0.6
0.6868             0.25
0.77               0.15
0.31               0.55
The code I tried to use is:
roccurve<- plot(roc(response = data$Actual_Findings_% <0.4, predictor = data$Predicted_Finding_Prob >0.5),
legacy.axes = TRUE, print.auc=TRUE, main = "ROC Curve", col = colors)
Where the threshold for positive findings is
Actual_Findings_% <0.4
AND
Predicted_Finding_Prob >0.5
(i.e., to be a TRUE POSITIVE, Actual_Findings_% would be LESS than 0.4, AND Predicted_Finding_Prob would be GREATER than 0.5)
but when I try to plot this roc curve, I get the error:
"Setting levels: control = FALSE, case = TRUE
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'plot': Predictor must be numeric or ordered."
Any help would be much appreciated!
This should work:
data <- read.table( text=
"Actual_Findings_% Predicted_Finding_Prob
0.23 0.6
0.48 0.3
0.26 0.62
0.23 0.6
0.48 0.3
0.47 0.3
0.23 0.6
0.6868 0.25
0.77 0.15
0.31 0.55
", header=TRUE, check.names=FALSE )
library(pROC)
roccurve <- plot(
roc(
response = data$"Actual_Findings_%" <0.4,
predictor = data$"Predicted_Finding_Prob"
),
legacy.axes = TRUE, print.auc=TRUE, main = "ROC Curve"
)
Now, importantly: the ROC curve is there to show you what happens when you vary your classification threshold. So one thing you are doing wrong is enforcing a single threshold by setting predictions > 0.5. That turns the predictor into a logical vector, which is why roc() complains that the predictor must be numeric or ordered.
This data does, however, give a perfect separation, which is nice I guess. (Though bad for educational purposes.)
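To make the point about varying the threshold concrete, here is a minimal base R sketch that computes the true and false positive rates by hand for the ten rows above. The ">=" cutoff convention and the names is_case, thresholds and rates are assumptions made for illustration; pROC chooses its own thresholds internally.

```r
# The ten rows from the example data
actual    <- c(0.23, 0.48, 0.26, 0.23, 0.48, 0.47, 0.23, 0.6868, 0.77, 0.31)
predicted <- c(0.6, 0.3, 0.62, 0.6, 0.3, 0.3, 0.6, 0.25, 0.15, 0.55)

# A case is an actual finding below 0.4, as in the question
is_case <- actual < 0.4

# Try every distinct predicted value as a cutoff (assumed convention: >=)
thresholds <- sort(unique(predicted))

rates <- t(sapply(thresholds, function(th) {
  called <- predicted >= th
  c(threshold = th,
    tpr = sum(called & is_case) / sum(is_case),     # sensitivity
    fpr = sum(called & !is_case) / sum(!is_case))   # 1 - specificity
}))
rates
```

Each row of rates is one point on the ROC curve. With this data, the cutoff 0.55 catches every case and no control, which is the perfect separation mentioned above.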
Suppose we have the following list structure:
bla <- list(lda = list(list(auc1 = 0.85, auc2 = 0.56), list(auc1 = 0.65, auc2 = 0.72)),
j48 = list(list(auc1 = 0.99, auc2 = 0.81), list(auc1 = 0.61, auc2 = 0.85)),
c50 = list(list(auc1 = 0.92, auc2 = 0.59), list(auc1 = 0.68, auc2 = 0.80)))
The desired output is a data frame structured as:
auc1 auc2
lda 0.85 0.56
lda 0.65 0.72
j48 0.99 0.81
j48 0.61 0.85
c50 0.92 0.59
c50 0.68 0.80
My attempt is pasted below. I'm able to purrr each inner list separately using the call:
bla[[1]] %>%
map(., function(x) c(auc1 = x[["auc1"]],
auc2 = x[["auc2"]])) %>%
map_dfr(., as.list)
Any idea is appreciated.
Surprisingly, bind_rows on its own gives you what you want.
dplyr::bind_rows(bla)
This returns a tibble and tibbles don't have rownames.
You can do this in base R using do.call + rbind.
do.call(rbind, unlist(bla, recursive = FALSE))
# auc1 auc2
#lda1 0.85 0.56
#lda2 0.65 0.72
#j481 0.99 0.81
#j482 0.61 0.85
#c501 0.92 0.59
#c502 0.68 0.8
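If you would rather have a plain data frame with the model name in its own column, something along these lines might work in base R. Treat it as a sketch: the column name model and the helper objects flat and res are assumptions, not part of the question.

```r
bla <- list(lda = list(list(auc1 = 0.85, auc2 = 0.56), list(auc1 = 0.65, auc2 = 0.72)),
            j48 = list(list(auc1 = 0.99, auc2 = 0.81), list(auc1 = 0.61, auc2 = 0.85)),
            c50 = list(list(auc1 = 0.92, auc2 = 0.59), list(auc1 = 0.68, auc2 = 0.80)))

# Drop one nesting level: six inner lists, each holding auc1 and auc2
flat <- unlist(bla, recursive = FALSE)

res <- data.frame(
  # Repeat each outer name once per inner list it contains
  model = rep(names(bla), lengths(bla)),
  auc1  = vapply(flat, function(x) x[["auc1"]], numeric(1), USE.NAMES = FALSE),
  auc2  = vapply(flat, function(x) x[["auc2"]], numeric(1), USE.NAMES = FALSE)
)
res
```

Unlike the rbind result, the columns here are ordinary numeric vectors, and the repeated labels live in a column because data frame row names have to be unique.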
We can also use rbindlist
library(data.table)
rbindlist(bla)
Given I have 4 different values
intensities <- c(0.1,-0.1,0.05,-0.05)
My goal is to randomly sample every value 5 times but positive and negative values should alternate, e.g.
resultingList = (0.1, -0.05, 0.05, -0.05, 0.1, -0.1, ...)
Does anybody know an elegant way to do this in R?
Maybe something like this
# seed
set.seed(123)
plus <- rep(intensities[intensities >= 0], each = 5)
minus <- rep(intensities[intensities < 0], each = 5)
out <- numeric(length(plus) + length(minus))
out[seq(1, length(out), 2)] <- sample(plus)
out[seq(2, length(out), 2)] <- sample(minus)
out
# [1] 0.10 -0.05 0.05 -0.10 0.10 -0.05 0.05 -0.05 0.05 -0.10 0.10 -0.05 0.05 -0.05 0.05 -0.10
# [17] 0.10 -0.10 0.10 -0.10
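As a sanity check, the two requirements (every value drawn exactly 5 times, signs strictly alternating) can be tested on the result. The snippet below repeats the construction so that it runs on its own; counts and alternates are names invented for this check.

```r
set.seed(123)
intensities <- c(0.1, -0.1, 0.05, -0.05)

plus  <- rep(intensities[intensities >= 0], each = 5)
minus <- rep(intensities[intensities < 0], each = 5)

out <- numeric(length(plus) + length(minus))
out[seq(1, length(out), 2)] <- sample(plus)   # odd positions: positive values
out[seq(2, length(out), 2)] <- sample(minus)  # even positions: negative values

# Requirement 1: each of the four values appears exactly 5 times
counts <- table(out)

# Requirement 2: the sign changes between every pair of neighbours
alternates <- all(diff(sign(out)) != 0)

counts
alternates
```

Because sample() is called without replacement on vectors that already hold each value 5 times, the counts hold for any seed, not just this one.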
If the intensities you are sampling from come in +/- pairs, you could just sample from the positive values and then flip the sign of every other number drawn (here N is the total number of draws):
N <- 5
positiveIntensities <- c(0.1, 0.05)
resultingList <- sample(positiveIntensities, N, replace = TRUE) * (-1)^(0:(N - 1))
Here is my solution: a custom function whose argument n is the length of the output. ceiling() and floor() determine how many odd and even positions have to be filled.
mySample <- function(x, n){
  res <- numeric(n)
  # odd positions: non-negative values, sampled with replacement
  res[seq(1, n, 2)] <- sample(x[x >= 0], ceiling(n / 2), replace = TRUE)
  # even positions: negative values, sampled with replacement
  res[seq(2, n, 2)] <- sample(x[x < 0], floor(n / 2), replace = TRUE)
  return(res)
}
intensities <- c(0.1, -0.1, 0.05, -0.05)
mySample(intensities, 10)
# [1] 0.10 -0.10 0.05 -0.05 0.10 -0.05 0.05 -0.05 0.05 -0.10
I'm using the psych package for factor analysis. I want to specify the labels of the latent factors, either in the fa() object, or when graphing with fa.diagram().
For example, with toy data:
require(psych)
n <- 100
choices <- 1:5
df <- data.frame(a=sample(choices, replace=TRUE, size=n),
b=sample(choices, replace=TRUE, size=n),
c=sample(choices, replace=TRUE, size=n),
d=sample(choices, replace=TRUE, size=n))
model <- fa(df, nfactors=2, fm="pa", rotate="promax")
model
Factor Analysis using method = pa
Call: fa(r = df, nfactors = 2, rotate = "promax", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
PA1 PA2 h2 u2 com
a 0.45 -0.49 0.47 0.53 2.0
b 0.22 0.36 0.17 0.83 1.6
c -0.02 0.20 0.04 0.96 1.0
d 0.66 0.07 0.43 0.57 1.0
I want to change PA1 and PA2 to FactorA and FactorB, either by changing the model object itself, or by adjusting the labels in the output of fa.diagram().
The docs for fa.diagram have a labels argument, but no examples, and the experimentation I've done so far hasn't been fruitful. Any help much appreciated!
With str(model) I found the $loadings attribute, which fa.diagram() uses to render the diagram. Modifying colnames() of model$loadings did the trick.
colnames(model$loadings) <- c("FactorA", "FactorB")
fa.diagram(model)
I have a dataframe and I need to calculate log for all numbers greater than 0 and log1p for numbers equal to 0. My dataframe is called tcPainelLog and looks like this (str of columns 6:8):
$ IDD: num 0.04 0.06 0.07 0.72 0.52 ...
$ Soil: num 0.25 0.22 0.16 0.00 0.00 ...
$ QAI: num 0.00 0.50 0.00 0.71 0.26 ...
Therefore, I guess I need to combine an ifelse statement with the log and log1p functions. I have tried several different ways to do it, but none has succeeded. For instance:
tcPainelLog <- tcPainel
cols <- names(tcPainelLog[,6:17]) # These are the columns I need to calculate
tcPainelLog$IDD <- ifelse(X = tcPainelLog$IDD>0, log(X), log1p(X))
tcPainelLog[cols] <- lapply(tcPainelLog[cols], function(x) ifelse((x > 0), log(x), log1p(x)))
tcPainelLog[cols] <- if(tcPainelLog[,cols] > 0) log(.) else log1p(.)
I haven't been able to get this to work and I would appreciate any help. I am sorry if this has already been explained somewhere; I searched with many different keywords but didn't find it.
Best regards.
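One way this could be approached is with a small helper applied column by column. The sketch below uses a made-up data frame toy as a stand-in for tcPainelLog, assumes all values are non-negative, and the helper name log_or_log1p is hypothetical.

```r
# Hypothetical stand-in for three of the columns in tcPainelLog
toy <- data.frame(IDD  = c(0.04, 0.06, 0.72),
                  Soil = c(0.25, 0.00, 0.00),
                  QAI  = c(0.00, 0.50, 0.71))
cols <- c("IDD", "Soil", "QAI")

# log() where x > 0, log1p() everywhere else (assumes x >= 0)
log_or_log1p <- function(x) {
  out <- log1p(x)               # default, used for the zeros
  pos <- !is.na(x) & x > 0
  out[pos] <- log(x[pos])       # overwrite the strictly positive entries
  out
}

toy[cols] <- lapply(toy[cols], log_or_log1p)
toy
```

Note that log1p(0) is 0, so the zeros stay 0 while positive values below 1 become negative; whether that mix of transformations is appropriate depends on the analysis.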