Hello, I created the following functions that test reliability. However, I want to combine them into one function, like reliability <- function(x), so that it gives me one matrix showing the answer of each function within "reliability". At the moment each function only gives me its answer on its own. Any ideas would help.
splithalf1<- function( data ) {
  n<-ncol(data)
  # split the items into odd-numbered and even-numbered columns
  odd<-data[ , seq(1,n , 2)]
  even<-data[ , seq(2 ,n , 2)]
  top_odd<-rowSums(odd)
  top_even<-rowSums(even)
  kor<-cor(top_odd,top_even)
  # Spearman-Brown correction of the half-test correlation
  r<-2*kor / (1+kor)
  return(r)
}
cr.alpha2<- function(x) {
n<-ncol(x)
kov<-cov(x)
kov1<-as.vector(kov)
kov2<-unique (kov1)
kov3<- kov2[-1]
kov4<-sum(kov3)/length(kov3)
pay<- n*kov4
payd<- (1 + (n-1)*kov4)
alpha<-pay/payd
return(alpha)
}
kr20<-function(x) {
n<-ncol(x)
pq<-function(x) {
p<-mean(x)
q<-1-p
res<-p*q
return (res)
}
pay<- sum(apply(x,2,pq))
top<-rowSums(x)
payda<-var(top)
result<- n /(n-1)* (1-(pay/payda))
return(result)
}
Stack Overflow is not a coding service! As a teaching service to you, however, I will suggest several things, which are probably what your teacher intended in the first place!
Study up on fundamental R. There are innumerable (well numerable but innumerable for all practical individual purposes) resources for free on the net. One good (IMO), free, intro-to-journeyman-level book is R for Data Science which can be accessed here: https://r4ds.had.co.nz/
check out assigning function return values to vars.
check out the c() and matrix() functions.
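To make those hints concrete, here is a minimal sketch rather than a definitive solution. The wrapper name reliability() and its fns argument are hypothetical, the two estimators are compact stand-ins for the functions in the question, and the 0/1 item matrix is simulated purely for illustration.

```r
# Compact stand-ins for two of the question's functions (assumed equivalents)
splithalf1 <- function(data) {
  n <- ncol(data)
  odd  <- rowSums(data[, seq(1, n, 2), drop = FALSE])
  even <- rowSums(data[, seq(2, n, 2), drop = FALSE])
  kor <- cor(odd, even)
  2 * kor / (1 + kor)
}
kr20 <- function(x) {
  n <- ncol(x)
  p <- colMeans(x)
  n / (n - 1) * (1 - sum(p * (1 - p)) / var(rowSums(x)))
}

# Hypothetical wrapper: call every function on x, keep each return value,
# and arrange the values as a one-row matrix with named columns
reliability <- function(x, fns) {
  res <- sapply(fns, function(f) f(x))
  matrix(res, nrow = 1, dimnames = list(NULL, names(res)))
}

set.seed(1)
items <- matrix(rbinom(200, 1, 0.6), nrow = 20)  # simulated 20 x 10 item scores
out <- reliability(items, list(splithalf = splithalf1, kr20 = kr20))
print(out)
```

Adding a third estimator such as cr.alpha2 only requires one more entry in the list passed as fns.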
I have been given the following R code, but I am having trouble understanding what it is doing. In fact, I cannot even run it in R because of its syntax. I assume the syntax is for lower-level code behind R. If someone could explain what's happening here and translate this into executable R code, that would be very helpful.
soft_thresholding = function(x,a){
result a)] a)] - a
result[which(x < -a)] = x[which(x < -a)] + a
return(result)}
Here is a summary of the findings. This is not a definitive answer but could help the questioner.
If one uses WordPress, then x <- a will look like x < -a. Check this URL that confirms this assumption.
A further online search for the function name in the question, "soft_thresholding", suggests that this function is probably attempting the soft thresholding defined here.
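For reference, the textbook soft-thresholding operator can be sketched as below. This is an assumption about what the garbled code was meant to compute, not a recovery of the original lines, and the name soft_threshold is hypothetical.

```r
# Textbook soft thresholding: shrink every value toward zero by a,
# and set values with |x| <= a to exactly zero
soft_threshold <- function(x, a) {
  sign(x) * pmax(abs(x) - a, 0)
}

x <- c(-3, -0.5, 0, 0.5, 3)
shrunk <- soft_threshold(x, a = 1)
print(shrunk)
```

Values beyond the threshold move toward zero by a, which matches the surviving line of the question's code that adds a to the entries below -a.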
Some more online searching about soft thresholding lands on a CRAN package that is present here.
Further deepdive into the r folder in the package binaries shows the following.
soft.threshold <- function(x,sumabs=1)
return(soft(x, BinarySearch(x,sumabs)))
The function above seems very close to the code in the question.
Furthermore, the soft.threshold function uses another internal function BinarySearch that looks like this.
BinarySearch <-
function(argu,sumabs){
if(norm2(argu)==0 || sum(abs(argu/norm2(argu)))<=sumabs) return(0)
lam_max = max(abs(argu))
lam1 <- 0
lam2 <- lam_max
iter <- 1
while(iter < 500){
su <- soft(argu,(lam1+lam2)/2)
if(sum(abs(su/norm2(su)))<sumabs){
lam2 <- (lam1+lam2)/2
} else {
lam1 <- (lam1+lam2)/2
}
if((lam2-lam1)/lam1 < 1e-10){
if (lam2 != lam_max){
return(lam2)
}else{
return(lam1)
}
}
iter <- iter+1
}
warning("Didn't quite converge")
return((lam1+lam2)/2)
}
This chain of searches leads one to believe that the function is perhaps attempting to mimic the function soft.threshold in the CRAN package "RGCCA".
Hope it helps
I tried updating the data in a data frame, but it is not getting updated.
# Initialize data and data frame here
user_data=read.csv("train_5.csv")
baskets.df=data.frame(Sequence=character(),
Challenge=character(),
countno=integer(),
stringsAsFactors=FALSE)
# Update data in the data frame here
for(i in 1:length((user_data)))
{
for(j in i:length(user_data))
{
if(user_data$challenge_sequence[i]==user_data$challenge_sequence[j]&&user_data$challenge[i]==user_data$challenge[j])
{
writedata(user_data$challenge_sequence[i],user_data$challenge[i])
}
}
}
writedata=function( seqnn,challng)
{
#print(seqnn)
#print(challng)
newRow <- data.frame(Sequence=seqnn,Challenge=challng,countno=1)
baskets.df=rbind(baskets.df,newRow)
}
# View data here
View(baskets.df)
I've modified your code to what I believe will work. You haven't provided sample data, so I can't verify that it works the way you want. I'm basing my attempt here on a couple of common novice mistakes that I'll do my best to explain.
Your writedata function was written to be a little loose with its scope. When you create a new function, what happens in the function technically happens in its own environment. That is, it tries to look for things defined within the function, and any new objects it creates are created only within that environment. R also has this neat (and sometimes tricky) feature where, if it can't find an object in an environment, it will look in the parent environment.
The impact this has on your writedata function is that when R looks for baskets.df in the function and can't find it, R then turns to the Global Environment, finds baskets.df there, and then uses it in rbind. However, the result of rbind gets saved to a baskets.df in the function environment, and does not update the object of the same name in the global environment.
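The same effect can be reproduced in a few lines. This is an illustrative sketch with hypothetical names (counter, bump_local, bump_value), not part of the original code.

```r
counter <- 0

# Reads counter from the enclosing environment, but the assignment
# creates a new local counter that disappears when the function returns
bump_local <- function() {
  counter <- counter + 1
  counter
}

returned <- bump_local()
after_local <- counter      # the outer counter was never modified

# Pass the value in, return the new value, and reassign at the call site
bump_value <- function(value) {
  value + 1
}
counter <- bump_value(counter)
```

After bump_local() the outer counter is still 0 even though the function returned 1; only the explicit reassignment in the last line changes it.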
To address this, I added an argument to writedata that is simply named data. We can then use this argument to pass a data frame to the function's environment and do everything locally. By not making any assignment at the end, we implicitly tell the function to return its result.
Then, in your loop, instead of simply calling writedata, we assign its result back to baskets.df to replace the previous result.
# writedata must be defined before the loop that calls it
writedata=function(data, seqnn,challng)
{
  newRow <- data.frame(Sequence = seqnn,
                       Challenge = challng,
                       countno = 1)
  # the value of the last expression is what the function returns
  rbind(data, newRow)
}
# nrow() counts rows; length() on a data frame would count columns
for(i in 1:nrow(user_data))
{
  for(j in i:nrow(user_data))
  {
    if(user_data$challenge_sequence[i] == user_data$challenge_sequence[j] &&
       user_data$challenge[i] == user_data$challenge[j])
    {
      baskets.df <- writedata(baskets.df,
                              user_data$challenge_sequence[i],
                              user_data$challenge[i])
    }
  }
}
I'm not sure what your programming background is, but your loops will be very slow in R because it's an interpreted language. To get around this, many functions are vectorized (which simply means that you give them more than one data point, and they do the looping inside compiled code where the loops are fast).
With that in mind, here's what I believe will be a much faster implementation of your code:
user_data=read.csv("train_5.csv")
# challenge_indices will be a matrix with TRUE at every place "challenge" and "challenge_sequence" is the same
challenge_indices <- outer(user_data$challenge_sequence, user_data$challenge_sequence, "==") &
outer(user_data$challenge, user_data$challenge, "==")
# since you don't want duplicates, get rid of them
challenge_indices[upper.tri(challenge_indices, diag = TRUE)] <- FALSE
# now let's get the indices of interest
index_list <- which(challenge_indices,arr.ind = TRUE)
# now we make the resulting data set all at once
# this is much faster, because it does not require copying the data frame many times - which would be required if you created a new row every time.
baskets.df <- with(user_data, data.frame(
  Sequence = challenge_sequence[index_list[,"row"]],
  challenge = challenge[index_list[,"row"]]
))
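To see what the outer() approach produces, here is a small sketch on made-up data. The toy data frame stands in for train_5.csv, whose contents are not shown in the question, and the countno column is added only to mirror the original baskets.df layout.

```r
# Toy stand-in for user_data (assumed column names from the question)
toy <- data.frame(challenge_sequence = c(1, 2, 1, 3),
                  challenge = c("a", "b", "a", "a"),
                  stringsAsFactors = FALSE)

# TRUE wherever row i and row j agree on both columns
same <- outer(toy$challenge_sequence, toy$challenge_sequence, "==") &
  outer(toy$challenge, toy$challenge, "==")

# Keep only the lower triangle so each pair appears once and
# no row is matched with itself
same[upper.tri(same, diag = TRUE)] <- FALSE

# Row/column positions of the remaining TRUE cells
idx <- which(same, arr.ind = TRUE)

matched <- data.frame(Sequence = toy$challenge_sequence[idx[, "row"]],
                      Challenge = toy$challenge[idx[, "row"]],
                      countno = 1)
print(matched)
```

Only rows 1 and 3 of the toy data agree on both columns, so the result has a single row. Note that outer() builds an n-by-n matrix, so memory use grows quadratically with the number of rows.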
Sorry for the trivial question, but I'm not a programmer. Did I translate the following task into an R function correctly?
I have recurrence equations, e.g. (p1_par, ..., p4_par are the parameters to find):
z1[i+1]= z1[i]+p1_par*p2_par
z12[i+1]= z12[i]+(p1_par*z1[i]-p3_par*z1z2[i]-p4_par)*p2_par
z1z2[i+1]=z1z2[i]+(-p3_par*z12[i]-p4_par*z1z2[i])*p2_par
i=1,...,5
with the initial conditions for i=0:
z1_0=1.23
z12_0=1
z1z2_0=0
and t=6, y=c(0.1,0.06,0.08,0.04,0.05,0.01)
I want to find the parameters that minimize a function, e.g. like this:
(-2*p1_par*z1[i]-z12[i]+y[i+1]^2+2*p3_par*z1z2[i]+2*p4_par*z1z3[i])^2
I tried to build the function in R like this:
function1=function(p1_par,p2_par,p3_par,p4_par,y,t){
ep=1
summa=0
result=rep(1,t)
for(i in 1:t){
z1_0=1.23
z12_0=1
z1z2_0=0
z1[1]=z1_0+p1_par*p2_par
z12[1]=z12_0+(p1_par*z1_0-p3_par*z1z2_0-p4_par)*p2_par
z1z2[1]=z1z2_0+(-p3_par*z12_0-p4_par*z1z2_0)*p2_par
z1[i+1]= z1[i]+p1_par*p2_par
z12[i+1]= z12[i]+(p1_par*z1[i]-p3_par*z1z2[i]-p4_par)*p2_par
z1z2[i+1]=z1z2[i]+(-p3_par*z12[i]-p4_par*z1z2[i])*p2_par
if(i==1) {
result[ep]=(-2*p1_par*z1_0-z12_0+y[i+1]^2+2*p3_par*z1z2_0+2*p4_par*z1z3_0)^2
} else {
result[ep]=(-2*p1_par*z1[i]-z12[i]+y[i+1]^2+2*p3_par*z1z2[i]+2*p4_par*z1z3[i])^2
}
summa<<-summa+result[ep]
ep=ep+1
}
return(result)
}
Did I translate the task into an R function correctly? Results from other software (like Math) differ. Thanks in advance for your help.
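One way to check the recurrence part in isolation is sketched below. The function name simulate_z and the parameter values in the call are hypothetical, and the objective is left out because it refers to z1z3, which the question does not define. Since R vectors are 1-based, position 1 holds the initial condition (i = 0) and position k + 1 holds step k.

```r
# Iterate the three recurrences from the stated initial conditions
simulate_z <- function(p1_par, p2_par, p3_par, p4_par, steps = 5) {
  z1 <- z12 <- z1z2 <- numeric(steps + 1)
  z1[1] <- 1.23   # z1_0
  z12[1] <- 1     # z12_0
  z1z2[1] <- 0    # z1z2_0
  for (k in seq_len(steps)) {
    z1[k + 1]   <- z1[k] + p1_par * p2_par
    z12[k + 1]  <- z12[k] + (p1_par * z1[k] - p3_par * z1z2[k] - p4_par) * p2_par
    z1z2[k + 1] <- z1z2[k] + (-p3_par * z12[k] - p4_par * z1z2[k]) * p2_par
  }
  data.frame(step = 0:steps, z1 = z1, z12 = z12, z1z2 = z1z2)
}

# Arbitrary illustrative parameter values
traj <- simulate_z(p1_par = 0.1, p2_par = 0.5, p3_par = 0.2, p4_par = 0.3)
print(traj)
```

Comparing this table row by row with the other software's trajectory shows whether a discrepancy comes from the recurrences or from the objective. Once the objective is fully specified, its sum over steps could be wrapped in a function of the parameter vector and passed to optim().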
I hope everyone is well. I have a question that may look like a dumb one, but I really need someone to explain it to me. I also thought it would be useful for others, since it has been asked before with no satisfactory answer.
Since I have a matrix with mixed data types, I was looking for a k-nearest neighbors algorithm that works with Gower distance in R. I found the function knngow in the package dprep, which claims to do this.
http://finzi.psych.upenn.edu/library/dprep/html/knngow.html
The function takes three arguments, knngow(Training_Set, Testing_set, K_number), and returns the predicted class.
I was playing around with it and was wondering how the function recognizes my target vector. Put differently, how does it return the predicted class without my telling it in advance which column is the target?
Please find the source code below (I retrieved it using the function edit):
function (train, test, k)
{
p = dim(train)[2]
ntest = dim(test)[1]
ntrain = dim(train)[1]
classes = rep(0, ntest)
if (ntest == ntrain) {
for (i in 1:ntest) {
tempo = order(gower.dist(test[i, -p], train[-i,
-p]))[1:k]
classes[i] = moda(train[tempo, p])[1]
}
}
else {
for (i in 1:ntest) {
tempo = order(StatMatch::gower.dist(test[i, -p],
train[, -p]))[1:k]
classes[i] = moda(train[tempo, p])[1]
}
}
classes
}
Can someone please explain the code to me?
I hope I have posted the question in the correct form; please let me know if I should move it somewhere else.
Thank you very much for your time.
The knngow function takes the last column of train as the target attribute: p = dim(train)[2] is the number of columns, so column p is the last one.
Column p (the last column of your training data) is not used for calculating the Gower distance. It is only taken into account when predicting the class label of the test samples.
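The indexing can be seen on a small made-up data frame. The column names and values here are hypothetical and only illustrate how train[, -p] and train[, p] split the data.

```r
# Toy training set with mixed types; the class label is the last column
train <- data.frame(height = c(1.6, 1.8, 1.7),
                    colour = c("red", "blue", "red"),
                    class  = c("A", "B", "A"),
                    stringsAsFactors = FALSE)

p <- dim(train)[2]        # number of columns, here 3

features <- train[, -p]   # everything except the last column (used for distances)
target   <- train[, p]    # the last column only (used for the majority vote)

print(names(features))
print(target)
```

If the class label is not already the last column, reordering the columns before calling knngow, for example with train[, c(setdiff(names(train), "class"), "class")], puts it where the function expects it.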
This is my first R code, and it is a very simple deduplication, but it is working so slowly I can't believe it! My question is: is it normal that it is working so slowly, or is my code just bad?
Here it is:
file1=c(read.delim("file.txt", header=TRUE))
dedupes<-0
i<-1
n<-1
while (i<=100) {
while (n<=100) {
if (file1$email[i]==file1$email[n] && i!=n) {
#Remember amount of dedupes
dedupes=dedupes+1
#Show dedupes
print(file1$email[i]) }
n<-n+1
}
n<-1
i<-i+1
}
#Show amount of dedupes
cat("There are ", dedupes/2, " dedupes")
Many thanks in advance,
Saitam
Nested loops are well known to be slow in R. You need to vectorize your calculation or use existing optimized functions, as in BondedDust's suggestion.
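As one possible vectorized version, the sketch below uses base R's duplicated() and table() on a made-up vector of addresses. It is not necessarily the suggestion referred to above, and the emails vector stands in for file1$email.

```r
# Made-up stand-in for file1$email
emails <- c("a@x.com", "b@x.com", "a@x.com", "c@x.com", "b@x.com", "a@x.com")

# TRUE for every element that already appeared earlier in the vector
dup_flags <- duplicated(emails)

n_repeats   <- sum(dup_flags)             # entries to drop to get unique addresses
dup_emails  <- unique(emails[dup_flags])  # which addresses occur more than once
deduped     <- emails[!dup_flags]         # first occurrence of each address

# The nested loops count matching pairs (dedupes/2); the same number
# follows from the frequency of each address
counts  <- as.vector(table(emails))
n_pairs <- sum(choose(counts, 2))
```

Here n_repeats counts the surplus entries, while n_pairs reproduces the pair count of the original double loop, so the two numbers differ whenever an address occurs more than twice.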