While loop problem in R

I am trying to get this loop in my R program to work, but it is not giving me the results I want. I am trying to model an insurance contract with n securities, each with a fixed likelihood of default (the vector data[i,2]) and a payout (the vector data[i,1]).
I need to price stop losses at the security level and at the portfolio level. To do this I created two while loops over the conditional vectors for each level (which the user will pass into the function), one while loop to scan through the securities, and a final one to model the scenarios. I tried to use R's matrix capabilities to organize the results.
The problem is that the if statement behaves oddly: it does not activate and filter correctly. This makes the program slow and the results wrong. It always fills the individual protection column rather than conditioning on the likelihood vector (data[i,2]). There are a lot of moving parts, but overall it is a simple model.
y = years
nr=nrow(data1)
nc=ncol(data1)
isl = individualStopLoss
asl = aggregateStoploss
Lasl = length(asl)
LIsl = length(isl)
claims = vector(mode = "logical",length= asl)
individualProtection = matrix(0,ncol=LIsl,nrow=y)
aggregateProtection = matrix(0,ncol=Lasl ,nrow=y)
expectedClaims = data1[,1]*data1[,2]
expectedClaims = sum(expectedClaims)
k = 1
m=1
while (k<=y)
{j = 1
m = 1
runi = runif(nr, min=0, max=1)
while (m<=Lasl)
{while (j<=LIsl)
{i=1
while (i<=nr)
{if ( runi[i] < data1[i,2] )
{individualProtection[k,j] = individualProtection[k,j] + max(data1[i,1]-isl[j],0)
claims[k] = claims[k] + data1[i,1]
i=i+1
}
else{i= i+1}
}
j=j+1
}
aggregateProtection[k,m]= aggregateProtection[k,m] + max(claims[k] - expectedClaims*asl[m],0)
m = m+1
}
k = k+1
}
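One possible reading of the code above, offered as a sketch rather than a confirmed diagnosis: claims is sized from asl instead of y and created as a logical vector, j is initialised once per year so the j loop only runs for the first m, and claims[k] is incremented inside the j loop, so it is counted once per individual stop-loss level. The function below is an assumed restructuring (simulate_stop_loss and its argument names are hypothetical) that draws the defaults once per year and then vectorises over both stop-loss vectors.

```r
# Hypothetical restructuring of the simulation; not the asker's final code.
# data1: column 1 = payout, column 2 = probability of default.
simulate_stop_loss <- function(data1, years, isl, asl) {
  nr <- nrow(data1)
  expected_claims <- sum(data1[, 1] * data1[, 2])
  individual <- matrix(0, nrow = years, ncol = length(isl))
  aggregate_protection <- matrix(0, nrow = years, ncol = length(asl))
  claims <- numeric(years)
  for (k in seq_len(years)) {
    # One uniform draw per security decides whether it defaults this year
    defaulted <- runif(nr) < data1[, 2]
    payouts <- data1[defaulted, 1]
    claims[k] <- sum(payouts)
    # Excess over each individual stop-loss level, summed over defaulted securities
    individual[k, ] <- vapply(isl, function(s) sum(pmax(payouts - s, 0)), numeric(1))
    # Excess of total claims over each aggregate level (a multiple of expected claims)
    aggregate_protection[k, ] <- pmax(claims[k] - expected_claims * asl, 0)
  }
  list(claims = claims, individual = individual, aggregate = aggregate_protection)
}

# Toy input: both securities always default, so the result is deterministic
toy <- cbind(payout = c(10, 20), p_default = c(1, 1))
res <- simulate_stop_loss(toy, years = 3, isl = c(5, 15), asl = c(0.5, 1))
```

With real default probabilities the output is random, so calling set.seed() before the function makes a run reproducible.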

Just an example to help you provide a reproducible example, will be deleted when your question is updated.
data1 <- cbind(rnorm(1000),rnorm(1000))
y = sample(rep(1990:2011,1000),1000)
nr=nrow(data1)
nc=ncol(data1)
isl = rnorm(500)
asl = rnorm(500)
Lasl = length(asl)
LIsl = length(isl)

Related

For loops to make leave-one-out analysis with netmeta

I'm doing a network meta-analysis of 29 studies using the netmeta package in R, and I now have to do the leave-one-out analysis. I was wondering whether there is a way to use for loops to get the results of this method, rather than manually excluding one trial at a time.
I came up with this:
for (i in 1:29){
NMA_DB_L<-NMA_DB[-i,]
yi_All_cause<-summary(escalc(ai= NMA_DB_L$All_Cause_d_C, bi=NMA_DB_L$PTS_All_Cause_d_C - NMA_DB_L$All_Cause_d_C,
ci= NMA_DB_L$All_Cause_d_I, di= NMA_DB_L$PTS_All_Cause_d_I - NMA_DB_L$All_Cause_d_I,
measure = "RR"))[,"yi"]
sei_All_cause<-summary(escalc(ai= NMA_DB_L$All_Cause_d_C, bi=NMA_DB_L$PTS_All_Cause_d_C - NMA_DB_L$All_Cause_d_C,
ci= NMA_DB_L$All_Cause_d_I, di= NMA_DB_L$PTS_All_Cause_d_I - NMA_DB_L$All_Cause_d_I,
measure = "RR"))[,"sei"]
netmeta(TE=yi_All_cause, seTE = sei_All_cause, treat1 = NMA_DB_L$Arm_1, treat2 = NMA_DB_L$INT, sm="RR",
studlab = NMA_DB_L$Study, reference.group = "Standard_DAPT")
}
and it seems to work properly, but I cannot find a way to save the results of each analysis without one of the trials.
Does anyone have an idea of how to do so?
Consider also lapply (to avoid the bookkeeping of initializing a list and assigning into it by index in a for loop). Also, use a defined function and avoid rerunning summary + escalc just to retrieve each column. Run it once and extract the columns as needed.
# DEFINED FUNCTION TO RUN CALCULATIONS
# FOLLOW DRY (I.E., DON'T REPEAT YOURSELF)
run_trials <- function(i) {
NMA_DB_L <- NMA_DB[-i,]
results <- summary(escalc(
ai = NMA_DB_L$All_Cause_d_C,
bi = NMA_DB_L$PTS_All_Cause_d_C - NMA_DB_L$All_Cause_d_C,
ci = NMA_DB_L$All_Cause_d_I,
di = NMA_DB_L$PTS_All_Cause_d_I - NMA_DB_L$All_Cause_d_I,
measure = "RR"
))
yi_All_cause <- results[,"yi"]
sei_All_cause <- results[,"sei"]
netmeta(
TE = yi_All_cause,
seTE = sei_All_cause,
treat1 = NMA_DB_L$Arm_1,
treat2 = NMA_DB_L$INT, sm="RR",
studlab = NMA_DB_L$Study,
reference.group = "Standard_DAPT"
)
}
# BUILD LIST OF RESULTS
netmeta_results <- lapply(1:29, run_trials)
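The same pattern can be written without hard-coding 29 and with named results. This is a minimal sketch that uses a stand-in data frame and a stand-in analysis (an inverse-variance weighted mean) in place of escalc and netmeta, which are not called here; toy_db, leave_one_out and pooled are hypothetical names.

```r
# Stand-in data: one row per study (values are illustrative only)
toy_db <- data.frame(
  Study = paste0("S", 1:5),
  TE    = c(0.10, 0.25, -0.05, 0.30, 0.15),
  seTE  = c(0.10, 0.20, 0.15, 0.10, 0.25)
)

# Run fit_fun once per omitted row and name each result after the dropped study
leave_one_out <- function(db, fit_fun) {
  res <- lapply(seq_len(nrow(db)), function(i) fit_fun(db[-i, , drop = FALSE]))
  setNames(res, paste0("without_", db$Study))
}

# Stand-in for the real analysis: inverse-variance weighted mean effect
pooled <- function(db) weighted.mean(db$TE, 1 / db$seTE^2)

loo <- leave_one_out(toy_db, pooled)
names(loo)
```

Naming the list elements makes it easier to see which trial was left out when inspecting a particular result.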
Why not save the outputs of netmeta function into a list?
# Create list of length 29
net_results <- vector('list', 29)
for (i in 1:29) {
NMA_DB_L<-NMA_DB[-i,]
...
net <- netmeta(TE=yi_All_cause, seTE = sei_All_cause,
treat1 = NMA_DB_L$Arm_1, treat2 = NMA_DB_L$INT, sm="RR",
studlab = NMA_DB_L$Study, reference.group = "Standard_DAPT")
net_results[[i]] <- net
}
You can then access results of the specific run with net_results[[1]] etc.
R lists can in general contain elements of any type, which makes them a suitable structure for this kind of problem.
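Once the fits are stored in a list, one number per run can be pulled out with vapply. The sketch below uses lm fits on the built-in mtcars data as stand-ins for the netmeta objects; which components a netmeta object exposes is not assumed here.

```r
# Pre-allocate the list, then fit one model per left-out row
fits <- vector("list", nrow(mtcars))
for (i in seq_len(nrow(mtcars))) {
  fits[[i]] <- lm(mpg ~ wt, data = mtcars[-i, ])
}

# Extract the same coefficient from every stored fit
slopes <- vapply(fits, function(f) coef(f)[["wt"]], numeric(1))
range(slopes)
```

vapply checks that every extracted value is a single number, which catches a failed or malformed fit early.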

Listing All Variables (Column Names) in R Shiny's checkboxGroupInput

I'm writing an R Shiny application and am having trouble with the checkboxGroupInput function in particular. I would like to create a dynamic list that automatically lists all columns of a dataset named source_file except the first one, source_file$Date, and I'm not sure how to do it. I would greatly appreciate any help you can provide!
Sample dataset of source_file would look something like this:
Date        Index 1  Index 2  Index 3  Index 4  Index 5
2016-01-01  +5%      -2%      +5%      +10%     +12%
2016-01-08  +3%      +13%     -8%      -3%      +10%
2016-01-15  +2%      +11%     -3%      +4%      -15%
The end goal is that I hope the checkboxGroupInput function will be able to automatically read all columns starting from the second column (ignore Date). In this case, the check box would load up 5 options, Index 1 to Index 5. It should be replicable such that it can load any number of indexes depending on the data specified. I tried hard-coding each individual index in but it's definitely counter-intuitive and so frustrating to do.
tabPanel("Target Volatility Portfolio",
sidebarPanel(
tags$h3("Find an optimised portfolio to achieve maximum return for a given level of risk/volatility"),
tags$h4("Input:"),
checkboxGroupInput("portfolio_selection",
"Select Number of Indexes for Portfolio",
choices = list(#####please send help here#####)
Edit: I would appreciate help fixing this as well.
I want to reference the output that comes from the checkbox in my global.R in this format. Basically, I want to use the selected variables to plot a graph. A selection of 2 variables will produce a graph for those 2 variables, whereas a selection of 10 variables will produce a plot involving all 10. (I'm plotting the efficient market frontier of x stocks, where x is the number of variables selected. It's a little hard to explain, but I hope the attached code gives you some insight.) The hashed line is what I need help fixing. Thank you!
plot_emf = function(n_points, target_vol, portfolio_selection)
{
first <- portfolio_selection[1]
last <- portfolio_selection[length(portfolio_selection)]
#######asset_returns = source_file[first:last]########
# Extract necessary parameters
n_assets = ncol(asset_returns)
n_obs = nrow(asset_returns)
n_years = n_obs / 52
# Initialize containers for holding return and vol simulations
return_vector = c()
vol_vector = c()
sharpe_vector = c()
for (i in 1:n_points)
{
# Generate random weights for n assets from uniform(0,1)
asset_weights = runif(n_assets, min = 0, max = 1)
normalization_ratio = sum(asset_weights)
# Asset weights need to add up to 100%
asset_weights = asset_weights / normalization_ratio
# print(asset_weights)
# print(asset_returns)
# Generate the portfolio return vector using these weights
random_portfolio_returns = emf_portfolio_returns(
asset_weights,
asset_returns)
# print(random_portfolio_returns)
# plot_returns_histogram(random_portfolio_returns$portfolio_returns)
cumulative_return = calculate_cumulative_return(random_portfolio_returns$portfolio_returns)
annualized_return = 100*((1 + cumulative_return/100)^(1/n_years) - 1)
annualized_vol = sd(random_portfolio_returns$portfolio_returns)*(52^0.5)
sharpe = annualized_return / annualized_vol
return_vector = append(return_vector, annualized_return)
vol_vector = append(vol_vector, annualized_vol)
sharpe_vector = append(sharpe_vector, sharpe)
#print(paste("Asset weights:",asset_weights))
#print(paste("Anualized return:",annualized_return))
#print(paste("Annualized vol:",annualized_vol))
}
g = ggplot(data = data.frame(vol_vector, return_vector, sharpe_vector),
aes(x = vol_vector, y = return_vector, color = sharpe_vector)) +
scale_color_gradient(low = "red", high = "blue", name = "Sharpe Ratio\n(Return/Risk)") +
ggtitle("Efficient Market Frontier") +
xlab("Annualized Vol (%)") +
ylab("Annualized Return (%)") +
theme(plot.title = element_text(hjust=0.5)) + geom_vline(xintercept=target_vol) +
geom_point()
print(g)
}
You can try something like the following which uses colnames() to extract the new choices, and then updates the checkboxGroupInput with updateCheckboxGroupInput():
server <- function(input, output, session) {
# Read the data once per session - this step might be better to
# put in a `global.R` file
source_file <- read.csv("source_file.csv")
# Column names we want to show - all except `Date`
opts <- setdiff(colnames(source_file), "Date")
# Update your checkboxGroupInput:
updateCheckboxGroupInput(
session, "portfolio_selection", choices = opts
)
# Rest of app after this point --------------------------------------
}
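For the hashed line in plot_emf, input$portfolio_selection arrives as a character vector of the ticked column names, so the columns can be picked by name rather than with first:last. Here is a sketch outside Shiny, assuming a small stand-in for source_file built from the first three index columns of the sample table (as plain numbers) and a hypothetical helper select_assets:

```r
# Stand-in for source_file; check.names = FALSE keeps the space in "Index 1"
source_file <- data.frame(
  Date      = as.Date(c("2016-01-01", "2016-01-08", "2016-01-15")),
  "Index 1" = c(5, 3, 2),
  "Index 2" = c(-2, 13, 11),
  "Index 3" = c(5, -8, -3),
  check.names = FALSE
)

# Choices for the checkbox group: every column except Date
opts <- setdiff(colnames(source_file), "Date")

# Hypothetical helper: keep only the selected columns, always as a data frame
select_assets <- function(df, selection) {
  stopifnot(length(selection) > 0, all(selection %in% colnames(df)))
  df[, selection, drop = FALSE]
}

# What plot_emf could do with the selection coming from the checkbox
asset_returns <- select_assets(source_file, c("Index 1", "Index 3"))
ncol(asset_returns)
```

Note that read.csv converts column names to syntactically valid ones by default (check.names = TRUE), so "Index 1" would be read as "Index.1" unless that argument is set to FALSE.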

Creating a function to loop columns through an equation in R

Solution (thanks @Peter_Evan!) in case anyone coming across this question has a similar issue
(Original question is below)
## get all slopes (lm coefficients) first
# list of subfields of interest to loop through
sf <- c("left_presubiculum", "right_presubiculum",
"left_subiculum", "right_subiculum", "left_CA1", "right_CA1",
"left_CA3", "right_CA3", "left_CA4", "right_CA4", "left_GC-ML-DG",
"right_GC-ML-DG")
# dependent variables are sf, independent variable common to all models in the inner lm() call is ICV
# applies lm(subfield ~ ICV, data = DF) to all subfields of interest (sf) specified previously
lm.results <- lapply(sf, function(dv) {
temp.lm <- lm(get(dv) ~ ICV, data = DF)
coef(temp.lm)
})
# returns a list, where each element is a vector of coefficients
# do.call(rbind, ) will paste them together
lm.coef <- data.frame(sf = sf,
do.call(rbind, lm.results))
# tidy up name of intercept variable
names(lm.coef)[2] <- "intercept"
lm.coef
## set up all components for the equation
# matrix to store output
out <- matrix(ncol = length(sf), nrow = NROW(DF))
# name the rows after each subject
row.names(out) <- DF$Subject
# name the columns after each subfield
colnames(out) <- sf
# nested for loop that goes by subject (j) and subfield (i)
for(j in DF$Subject){
for (i in sf) {
slope <- lm.coef[lm.coef$sf == i, "ICV"]
out[j,i] <- as.numeric( DF[DF$Subject == j, i] - (slope * (DF[DF$Subject == j, "ICV"] - mean(DF$ICV))) )
}
}
# check output
out
===============
Original Question:
I have a dataframe (DF) with 13 columns (12 different brain subfields, and one column containing total intracranial volume(ICV)) and 50 rows (each a different participant). I'm trying to automate an equation being looped over every column for each participant.
The data:
structure(list(Subject = c("sub01", "sub02", "sub03", "sub04",
"sub05", "sub06", "sub07", "sub08", "sub09", "sub10", "sub11",
"sub12", "sub13", "sub14", "sub15", "sub16", "sub17", "sub18",
"sub19", "sub20"), ICV = c(1.50813, 1.3964237, 1.6703585, 1.4641886,
1.6351018, 1.5524641, 1.4445532, 1.6384505, 1.6152434, 1.5278011,
1.4788126, 1.4373356, 1.4109637, 1.3634952, 1.3853583, 1.4855268,
1.6082085, 1.5644998, 1.5617522, 1.4304141), left_subiculum = c(411.225013,
456.168033, 492.968477, 466.030173, 533.95505, 476.465524, 448.278213,
476.45566, 422.617374, 498.995121, 450.773906, 461.989663, 549.805272,
452.619547, 457.545623, 451.988333, 475.885847, 490.127968, 470.686415,
494.06548), left_CA1 = c(666.893596, 700.982955, 646.21927, 580.864234,
721.170599, 737.413139, 737.683665, 597.392434, 594.343911, 712.781376,
733.157168, 699.820162, 701.640861, 690.942843, 606.259484, 731.198846,
567.70879, 648.887718, 726.219904, 712.367433), left_presubiculum = c(325.779458,
391.252815, 352.765098, 342.67797, 390.885737, 312.857458, 326.916867,
350.657957, 325.152464, 320.718835, 273.406949, 305.623938, 371.079722,
315.058313, 311.376271, 319.56678, 348.343569, 349.102678, 322.39908,
306.966008), `left_GC-ML-DG` = c(327.037756, 305.63224, 328.945065,
238.920358, 319.494513, 305.153183, 311.347404, 259.259723, 295.369164,
312.022281, 324.200989, 314.636501, 306.550385, 311.399107, 295.108592,
356.197094, 251.098248, 294.76349, 317.308576, 301.800253), left_CA3 = c(275.17038,
220.862237, 232.542718, 170.088695, 234.707172, 210.803287, 246.861975,
171.90896, 220.83478, 236.600832, 246.842024, 239.677362, 186.599097,
224.362411, 229.9142, 293.684776, 172.179779, 202.18936, 232.5666,
221.896625), left_CA4 = c(277.614028, 264.575987, 286.605092,
206.378619, 281.781858, 258.517989, 269.354864, 226.269982, 256.384436,
271.393257, 277.928824, 265.051581, 262.307377, 266.924683, 263.038686,
306.133918, 226.364556, 262.42823, 264.862956, 255.673948), right_subiculum = c(468.762375,
445.35738, 446.536018, 456.73484, 521.041823, 482.768261, 487.2911,
456.39996, 445.392976, 476.146498, 451.775611, 432.740085, 518.170065,
487.642399, 405.564237, 487.188989, 467.854363, 479.268714, 473.212833,
472.325916), right_CA1 = c(712.973011, 717.815214, 663.637105,
649.614586, 711.844375, 779.212704, 862.784416, 648.925038, 648.180611,
760.761704, 805.943016, 717.486756, 801.853608, 722.213109, 621.676321,
791.672796, 605.35667, 637.981476, 719.805053, 722.348921), right_presubiculum = c(327.285242,
364.937865, 288.322641, 348.30058, 341.309111, 279.429847, 333.096795,
342.184296, 364.245998, 350.707173, 280.389853, 276.423658, 339.439377,
321.534798, 302.164685, 328.365751, 341.660085, 305.366589, 320.04127,
303.83284), `right_GC-ML-DG` = c(362.391907, 316.853532, 342.93274,
282.550769, 339.792696, 357.867386, 342.512721, 277.797528, 309.585721,
343.770416, 333.524912, 302.505077, 309.063135, 291.29361, 302.510461,
378.682679, 255.061044, 302.545288, 313.93902, 297.167161), right_CA3 = c(307.007404,
243.839349, 269.063801, 211.336979, 249.283479, 276.092623, 268.183349,
202.947849, 214.642782, 247.844657, 291.206598, 235.864996, 222.285729,
201.427853, 237.654913, 321.338801, 199.035108, 243.204203, 236.305659,
213.386702), right_CA4 = c(312.164065, 272.905586, 297.99392,
240.765062, 289.98697, 306.459566, 284.533068, 245.965817, 264.750571,
296.149675, 290.66935, 264.821461, 264.920869, 246.267976, 266.07378,
314.205819, 229.738951, 274.152503, 256.414608, 249.162404)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
The equation:
adjustedBrain(participant1) = rawBrain(participant1) - slope*[ICV(participant1) - (mean of all ICV measures included in the calculation of the slope)]
The code (which is not working and I was hoping for some pointers):
adjusted_Brain <- function(DF, subject) {
subfields <- colnames(select(DF, "left_presubiculum", "right_presubiculum",
"left_subiculum", "right_subiculum", "left_CA1", "right_CA1",
"left_CA3", "right_CA3", "left_CA4", "right_CA4", "left_GC-ML-DG",
"right_GC-ML-DG"))
out <- matrix(ncol = length(subfields), nrow = NROW(DF))
for (i in seq_along(subfields)) {
DF[i] = DF[DF$Subject == "subject", "i"] -
slope * (DF[DF$Subject == "subject", "ICV"] -
mean(DF$ICV))
}
}
Getting this error:
Error: Can't subset columns that don't exist.
x Column `i` doesn't exist.
A few notes:
The slopes for each subject for each subfield will be different (and will come from a regression) -> is there a way to specify that in the function so the slope (coefficient from the appropriate regression equation) gets called in?
I have my nrow set to the number of participants right now in the output because I'd like to have this run through EVERY subject across EVERY subfield and spit out a matrix with all the adjusted brain volumes... But that seems very complicated and so for now I will just settle for running each participant separately.
Any help is greatly appreciated!
As others have noted in the comments, there are quite a few syntax issues that prevent your code from running, as well as a few unstated requirements. That aside, I think there is enough to recommend a few improvements that you can hopefully build on. Here are the top line changes:
You likely don't need this to be a function, but rather a nested for loop (if you want to do this with base R). As written, the code isn't flexible enough to merit a function. If you intend to apply this many times across different datasets, a function might make sense. However, it will require a much larger rewrite.
Assuming you are fitting a simple regression via lm, then you can pull out the coefficient of interest via the $ operator and indexing (see below). Some thought will need to go into how to handle different models in the loop. Here, we assume you only need one coefficient from one model.
There are a few areas where the syntax is incorrect, and a review of subsetting in base R would be helpful. Others have pointed out in the comments where some of these are.
Here is one approach where we loop through each subject (j) and each feature or subfield (i) and store the results in a matrix (out). This is just one approach and will almost certainly need tweaking on your end!
#NOTE: the dataset you provided is saved as x in this example.
#fit a linear model - here we assume there is only one coef. of interest, but you may need to alter
# depending on how the slope changes in each calculation
reg <- lm(ICV ~ right_CA3, x)
# view the coeff.
reg$coefficients
# pull out the slope from the reg object by index: the first coefficient is the
# intercept, so the slope is the second one
slope <- reg$coefficients[[2]]
# list of features/subfields to loop through
sf <- c("left_presubiculum", "right_presubiculum",
"left_subiculum", "right_subiculum", "left_CA1", "right_CA1",
"left_CA3", "right_CA3", "left_CA4", "right_CA4", "left_GC-ML-DG",
"right_GC-ML-DG")
# matrix to store output
out <- matrix(ncol = length(sf), nrow = NROW(x))
#name the rows after each subject
row.names(out) <- x$Subject
#name the columns after each subfield
colnames(out) <- sf
# nested for loop that goes by subject (j) and features/subfields (i)
for(j in x$Subject){
for (i in sf) {
out[j,i] <- as.numeric( x[x$Subject == j, i] - (slope * (x[x$Subject == j, "ICV"] - mean(x$ICV))) )
}
}
# check output
out
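The nested loop can also be written column by column, fitting one regression per subfield. This is a sketch on simulated data, assuming the slope for each subfield comes from regressing that subfield on ICV (as in the asker's solution at the top); adjust_for_icv and the toy data frame are hypothetical.

```r
# Hypothetical helper: adjusted = raw - slope * (ICV - mean(ICV)), per subfield
adjust_for_icv <- function(df, fields, icv = "ICV") {
  centered <- df[[icv]] - mean(df[[icv]])
  out <- sapply(fields, function(f) {
    # Second coefficient of subfield ~ ICV is the slope
    slope <- coef(lm(df[[f]] ~ df[[icv]]))[[2]]
    df[[f]] - slope * centered
  })
  rownames(out) <- df$Subject
  out
}

# Simulated stand-in for the real data
set.seed(42)
toy <- data.frame(Subject = paste0("sub", 1:10), ICV = runif(10, 1.3, 1.7))
toy$left_CA1  <- 400 * toy$ICV + rnorm(10, sd = 20)
toy$right_CA1 <- 450 * toy$ICV + rnorm(10, sd = 20)

adj <- adjust_for_icv(toy, c("left_CA1", "right_CA1"))
head(adj)
```

A quick sanity check on this kind of adjustment is that the adjusted volumes keep their original mean and are no longer correlated with ICV.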

Using cpquery function for several pairs from dataset

I am a relative beginner in R and am trying to figure out how to use the cpquery function from the bnlearn package for all edges of a DAG.
First of all, I created a bn object, a network of bn and a table with all strengths.
library(bnlearn)
data(learning.test)
baynet = hc(learning.test)
fit = bn.fit(baynet, learning.test)
sttbl = arc.strength(x = baynet, data = learning.test)
Then I tried to create a new variable in sttbl dataset, which was the result of cpquery function.
sttbl = sttbl %>% mutate(prob = NA) %>% arrange(strength)
sttbl[1,4] = cpquery(fit, `A` == 1, `D` == 1)
It looks pretty good (especially on bigger data), but when I am trying to automate this process somehow, I am struggling with errors, such as:
Error in sampling(fitted = fitted, event = event, evidence = evidence, :
logical vector for evidence is of length 1 instead of 10000.
Ideally, I need a function that fills the generated prob variable of the sttbl dataset regardless of its size. I tried to do it with a for loop too, but stumbled over the error above again and again. Unfortunately, I deleted the failed attempts, but they were something like this:
for (i in 1:nrow(sttbl)) {
j = sttbl[i,1]
k = sttbl[i,2]
sttbl[i,4]=cpquery(fit, fit$j %in% sttbl[i,1]==1, fit$k %in% sttbl[i,2]==1)
}
or this:
for (i in 1:nrow(sttbl)) {
sttbl[i,4]=cpquery(fit, sttbl[i,1] == 1, sttbl[i,2] == 1)
}
Now I think I misunderstood something in R or bnlearn package.
Could you please tell me how to realize this task with filling the column by multiple cpqueries? That would help me a lot with my research!
cpquery is quite difficult to work with programmatically. If you look at the examples in the help page you can see the author uses eval(parse(...)) to build the queries. I have added two approaches below, one using the methods from the help page and one using cpdist to draw samples and reweighting to get the probabilities.
Your example
library(bnlearn); library(dplyr)
data(learning.test)
baynet = hc(learning.test)
fit = bn.fit(baynet, learning.test)
sttbl = arc.strength(x = baynet, data = learning.test)
sttbl = sttbl %>% mutate(prob = NA) %>% arrange(strength)
This uses cpquery and the much maligned eval(parse(...)); this is the approach the bnlearn author takes to do this programmatically in the ?cpquery examples. Anyway,
# You want the evidence and event to be the same; in your question it is `1`
# but for example using learning.test data we use 'a'
state = "\'a\'" # note if the states are character then these need to be quoted
event = paste(sttbl$from, "==", state)
evidence = paste(sttbl$to, "==", state)
# loop through using code similar to that found in `cpquery`
set.seed(1) # to make sampling reproducible
for(i in 1:nrow(sttbl)) {
qtxt = paste("cpquery(fit, ", event[i], ", ", evidence[i], ",n=1e6", ")")
sttbl$prob[i] = eval(parse(text=qtxt))
}
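The error message in the question suggests that the event and evidence expressions are evaluated against the simulated samples, so each has to yield one logical value per sample; an expression such as sttbl[i,1] == 1 compares a single string and yields a single value. The base R sketch below illustrates that difference on a hand-made list standing in for the samples. It does not call bnlearn, make_condition is a hypothetical helper, and how such a call object would best be handed to cpquery is left open.

```r
# Build the comparison `node == state` as a call object instead of pasted text
make_condition <- function(node, state) call("==", as.name(node), state)

# Hand-made stand-in for simulated samples (not drawn from a network)
samples <- list(A = c("a", "b", "a", "c"), D = c("a", "a", "b", "c"))

cond <- make_condition("A", "a")
deparse(cond)

# Evaluated against the samples: one logical value per sample
hits <- eval(cond, samples)
length(hits)

# Evaluated on the node name itself: a single logical value
length("A" == "a")
```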
I find it preferable to work with cpdist, which is used to generate random samples conditional on some evidence. You can then use these samples to build up queries. If you use likelihood weighting (method="lw") it is slightly easier to do this programmatically (and without evil(parse(...))).
The evidence is added in a named list i.e. list(A='a').
# The following just gives a quick way to assign the same
# evidence state to all the evidence nodes.
evidence = setNames(replicate(nrow(sttbl), "a", simplify = FALSE), sttbl$to)
# Now loop though the queries
# As we are using likelihood weighting we need to reweight to get the probabilities
# (cpquery does this under the hood)
# Also note with this method that you could simulate from more than
# one variable (event) at a time if the evidence was the same.
for(i in 1:nrow(sttbl)) {
temp = cpdist(fit, sttbl$from[i], evidence[i], method="lw")
w = attr(temp, "weights")
sttbl$prob2[i] = sum(w[temp=='a'])/ sum(w)
}
sttbl
# from to strength prob prob2
# 1 A D -1938.9499 0.6186238 0.6233387
# 2 A B -1153.8796 0.6050552 0.6133448
# 3 C D -823.7605 0.7027782 0.7067417
# 4 B E -720.8266 0.7332107 0.7328657
# 5 F E -549.2300 0.5850828 0.5895373
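The reweighting step in the loop above is a weighted proportion. Here is a toy illustration with made-up samples and weights (not drawn from the network), wrapped in a hypothetical helper weighted_prob:

```r
# Share of the total weight carried by samples in the given state
weighted_prob <- function(samples, weights, state) {
  sum(weights[samples == state]) / sum(weights)
}

samples <- c("a", "b", "a", "c")
weights <- c(0.5, 0.2, 0.1, 0.2)

p_a <- weighted_prob(samples, weights, "a")
p_a
```

With equal weights this reduces to the plain proportion of samples in that state.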

How to continuously send data from LabVIEW to R? (code help)

I am trying to bring real time data from LabVIEW (vibration of a bearing and temperature) into an app written in R to create a control chart. It works for a while but eventually crashes with the following error message:
Error in aggregate.data.frame(B, list(rep(1:(nrow(B)%/%n + 1), each = n, :
no rows to aggregate
The process works as follows: LabVIEW takes the data and writes it to two Excel files. Those files are read by the R code and used to draw a control chart in R. The process succeeds for some time, but the time to failure varies. Sometimes the control chart will run for 6-7 min, other times it will crash in 2 min.
My suspicion is that the Excel files are not being updated fast enough, so the R code tries to read that Excel file when it is empty.
Any suggestions would be great! Thank you!
I have tried to lower the sample size taken per second. That did not work.
getwd()
setwd("C:/Users/johnd/Desktop/R Data")
while(1) {
A = fread("C:/Users/johnd/Desktop/R Data/a1.csv" , skip = 4 , header = FALSE , col.names = c("t1","B2","t2","AM","t3","M","t4","B1"))
t1 = A$t1
B2 = A$B2
t2 = A$t2
AM = A$AM
t3 = A$t3
M = A$M
t4 = A$t4
B1 = A$B1
B = fread("C:/Users/johnd/Desktop/R Data/b1.csv" , skip = 4 , header = FALSE , col.names = c("T1","small","T2","big"))
T1 = B$T1
small = B$small
T2 = B$T2
big = B$big
DJ1 = A[seq(1,nrow(A),1),c('t1','B2','AM','M','B1')]
DJ1
n = 16
DJ2 = aggregate(B,list(rep(1:(nrow(B)%/%n+1),each=n,len=nrow(B))),mean)[-1]
DJ2
#------------------------------------------------------------------------
DJ6 = cbind(DJ1[,'B1'],DJ2[,c('small','big')]) # creates matrix for these three indicators
DJ6
#--------------T2 Hand made---------------------------------------------------------------------
new_B1 = DJ6[,'B1']
new_small = DJ6[,'small'] ### decompose the DJ6 matrix into vectors for each indicator(temperature, big & small accelerometers)
new_big = DJ6[,'big']
new_B1
new_small
new_big
mean_B1 = as.numeric(colMeans(DJ6[,'B1']))
mean_small = as.numeric(colMeans(DJ6[,'small'])) ##decomposes into vectors of type numeric
mean_big = as.numeric(colMeans(DJ6[,'big']))
cov_inv = data.matrix(solve(cov(DJ6))) # obtain inverse covariance matrix
cov_inv
p = ncol(DJ6) #changed to pull number of parameters by taking the number of coumns in OG matrix #p=3 # #ofQuality Characteristics
m=64 # #of samples (10 seconds of data)
a_alpha = 0.99
f= qf(a_alpha , df1 = p,df2 = (m-p)) ### calculates the F-Statistic for our data
f
UCL = (p*(m+1)*(m-1)*(f))/(m*(m-p)) ###produces upper control limit
UCL
diff_B1 = new_B1-mean_B1
diff_small = new_small-mean_small
diff_big = new_big-mean_big
DJ7 = cbind(diff_B1, diff_small , diff_big) #produces matrix of difference between average and observations (x-(x-bar))
DJ7
# DJ8 = data.matrix(DJ7[1,])
# DJ8
DJ9 = data.matrix(DJ7) ### turns matrix into appropriate numeric form
DJ9
# T2.1.1 = DJ8 %*% cov_inv %*% t(DJ8)
# T2.1.1
# T2.1 = t(as.matrix(DJ9[1,])) %*% cov_inv %*% as.matrix(DJ9[1,])
# T2.1
#T2 <- NULL
for(i in 1:64){ #### creates vector of T^2 statistic
T2<- t(as.matrix(DJ9[i,])) %*% cov_inv %*% as.matrix(DJ9[i,]) # calculation of T^2 test statistic ## there is no calculation of x-double bar
write.table(T2,"C:/Users/johnd/Desktop/R Data/c1.csv",append=T,sep="," , col.names = FALSE)#
#
DJ12 <-fread("C:/Users/johnd/Desktop/R Data/c1.csv" , header = FALSE ) #
}
# DJ12
DJ12$V1 = 1:nrow(DJ12)
# plot(DJ12 , type='l')
p1 = nrow(DJ12)-m
p2 = nrow(DJ12)
plot(DJ12[p1:p2,], type ='o', ylim =c(0,15), ylab="T2 Chart" , xlab="Data points") ### plots last 640 points
# plot(DJ12[p1:p2,], type ='o' , ylim =c(0,15) , ylab="T2 Chart" , xlab="Data points")
abline(h=UCL , col="red") ## displays upper control limit
Sys.sleep(1)
}
The process succeeds for some time, but the time to failure varies. Sometimes the control chart will run for 6-7 min, other times it will crash in 2 min.
My suspicion is that the Excel files are not being updated fast enough, so the R code tries to read that Excel file when it is empty.
Your suspicion is correct.
With your current design, your R application can crash depending on how fast it runs relative to your LabVIEW application. This is called a race condition; you must eliminate race conditions from your code.
A quick and dirty solution
One simple solution to avoid the crash is to call NROW to check whether any data exists. If there's no data available, don't call aggregate. This is described in the question "error message in r: no rows to aggregate".
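A sketch of that guard, reusing the grouping expression from the question; safe_block_means is a hypothetical helper that returns NULL when there is nothing to aggregate, so the loop can wait and retry instead of stopping:

```r
# Average B in blocks of n rows; return NULL if B has no rows yet
safe_block_means <- function(B, n) {
  if (NROW(B) == 0) {
    return(NULL)
  }
  groups <- rep(seq_len(NROW(B) %/% n + 1), each = n, length.out = NROW(B))
  aggregate(B, list(groups), mean)[-1]
}

# Stand-ins for an empty read and a normal read of b1.csv
empty  <- data.frame(T1 = numeric(0), small = numeric(0), T2 = numeric(0), big = numeric(0))
filled <- data.frame(T1 = 1:32, small = rnorm(32), T2 = 1:32, big = rnorm(32))

safe_block_means(empty, 16)          # NULL, no error
blocks <- safe_block_means(filled, 16)
nrow(blocks)

# Inside the while loop this could be used as:
#   DJ2 <- safe_block_means(B, n)
#   if (is.null(DJ2)) { Sys.sleep(0.5); next }
```

This only hides the symptom: a file that is partly written at the moment it is read can still produce a short or inconsistent batch.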
A more robust solution
A better solution is to use a communications protocol like TCP to stream data from LabVIEW to R, instead of using CSV files to transfer real-time data. For example, your R program could listen for data on a TCP socket. Make it wait for data to be sent from LabVIEW before running your data processing code.
Here is an example on using socketConnection in R: http://blog.corynissen.com/2013/05/using-r-to-communicate-via-socket.html
Here is an example on sending/receiving data over TCP in LabVIEW: http://www.ni.com/product-documentation/2710/en/
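On the R side, the reading code can be written against a generic connection, so the same function works with a socket or with a test input. The sketch below assumes LabVIEW sends one comma-separated sample per line with the four columns of b1.csv; that wire format, the port number in the comment and the helper read_samples are assumptions, and a textConnection stands in for the socket so the example runs on its own.

```r
# Read up to n lines from any connection and parse them as samples.
# Returns NULL when no complete line is available yet.
read_samples <- function(con, n) {
  lines <- readLines(con, n = n)
  if (length(lines) == 0) {
    return(NULL)
  }
  read.csv(text = lines, header = FALSE,
           col.names = c("T1", "small", "T2", "big"))
}

# Stand-in for the socket: two samples, one per line
con <- textConnection(c("0.0,1.5,0.0,2.5", "0.1,1.7,0.1,2.9"))
batch <- read_samples(con, n = 16)
close(con)
batch

# With a real stream, the connection could instead be opened along these lines
# (the port is an assumed value and must match the LabVIEW side):
#   con <- socketConnection(port = 6011, server = TRUE, blocking = TRUE, open = "r")
```

Because a blocking read waits until data arrives, the processing code no longer depends on how fast the two programs run relative to each other.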
