I am a newbie with R and would like to understand what it can do for control charting. I have read articles on qcc and created sample charts in RStudio from my own datasets, both to generate the graphics and to get just the underlying data.
It appears that only two of the Shewhart control/run rules are implemented in qcc (+/- 3 sigma and a run of points above/below center), but more have been defined and are frequently used in practice, e.g. the Nelson rules.
Is there an R library/function that implements these? In addition to implementing the rules, I want the option to specify the "constant" related to each rule. For example, where the referenced article says "Eight points in a row ...", I would like eight to be a parameter. I'm thinking that the $data output from the qcc command could be passed as an argument (along with a vector of rule "constant" options), and the return would be a list of violation points and the rule number violated.
Any thoughts / suggestions?
We're working on the implementation of Nelson Rules in R. I think this is exactly what you're looking for (happy to share, I couldn't find an R implementation anywhere else on the internet):
nelsonr1 <- function(x, m = mean(x), s = sd(x)) {
  # Nelson's QC rule 1: detect values outside +/- 3 sd
  which(abs((x - m) / s) >= 3)
}

nelsonr2 <- function(x, m = mean(x), minrun = 9) {
  # Nelson's QC rule 2: detect runs of >= 9 points on the same side of the mean
  n <- length(x)
  counts <- sign(x - m)
  result <- counts
  for (runlength in 2:minrun)
    result <- result + c(counts[runlength:n], rep(0, runlength - 1))
  which(abs(result) >= minrun)
}

nelsonr3 <- function(x, minrun = 6) {
  # Nelson's QC rule 3: detect strict increase or decrease in >= 6 points in a row
  # Between 6 points you have 5 instances of increasing or decreasing. Therefore minrun - 1.
  n <- length(x)
  signs <- sign(c(x[-1], x[n]) - x)
  counts <- signs
  for (rl in 2:(minrun - 1)) {
    counts <- counts + c(signs[rl:n], rep(0, rl - 1))
  }
  which(abs(counts) >= minrun - 1)
}

nelsonr4 <- function(x, m = mean(x), minrun = 14, directing_from_mean = FALSE) {
  # Nelson's QC rule 4: 14 points in a row alternating in direction around the mean,
  # or 14 points in a row alternating between increase and decrease
  n <- length(x)
  if (directing_from_mean == TRUE) {
    signs <- sign(x - m)
  } else {
    signs <- sign(c(x[-1], x[n]) - x)
  }
  counts <- signs
  fac <- -1
  for (rl in 2:minrun) {
    counts <- counts + fac * c(signs[rl:n], rep(0, rl - 1))
    fac <- -fac
  }
  counts <- abs(counts)
  which(counts >= minrun)
}

nelsonr5 <- function(x, m = mean(x), s = sd(x), minrun = 3) {
  # Nelson's QC rule 5: two out of three points > 2 sd from the mean in the same direction
  n <- length(x)
  pos <- 1 * ((x - m) / s > 2)
  neg <- 1 * ((x - m) / s < -2)
  poscounts <- pos
  negcounts <- neg
  for (rl in 2:minrun) {
    poscounts <- poscounts + c(pos[rl:n], rep(0, rl - 1))
    negcounts <- negcounts + c(neg[rl:n], rep(0, rl - 1))
  }
  counts <- apply(cbind(poscounts, negcounts), 1, max)
  which(counts >= minrun - 1)
}

nelsonr6 <- function(x, m = mean(x), s = sd(x), minrun = 5) {
  # Nelson's QC rule 6: four out of five points > 1 sd from the mean in the same direction
  n <- length(x)
  pos <- 1 * ((x - m) / s > 1)
  neg <- 1 * ((x - m) / s < -1)
  poscounts <- pos
  negcounts <- neg
  for (rl in 2:minrun) {
    poscounts <- poscounts + c(pos[rl:n], rep(0, rl - 1))
    negcounts <- negcounts + c(neg[rl:n], rep(0, rl - 1))
  }
  counts <- apply(cbind(poscounts, negcounts), 1, max)
  which(counts >= minrun - 1)
}

nelsonr7 <- function(x, m = mean(x), s = sd(x), minrun = 15) {
  # Nelson's QC rule 7: >= 15 points in a row within 1 sd of the mean
  n <- length(x)
  within <- 1 * (abs((x - m) / s) < 1)
  counts <- within
  for (rl in 2:minrun)
    counts <- counts + c(within[rl:n], rep(0, rl - 1))
  which(counts >= minrun)
}

nelsonr8 <- function(x, m = mean(x), s = sd(x), minrun = 8) {
  # Nelson's QC rule 8: >= 8 points in a row all outside the m +/- 1 sd band
  n <- length(x)
  outofrange <- 1 * (abs((x - m) / s) > 1)
  counts <- outofrange
  for (rl in 2:minrun)
    counts <- counts + c(outofrange[rl:n], rep(0, rl - 1))
  which(counts >= minrun)
}
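To get the "list of violation points and rule number violated" that the question asks for, a small wrapper can tie these together. This is only a sketch under the assumption that the eight functions above are defined; nelson_violations and its runs argument are my own names, not from any package:

nelson_violations <- function(x, m = mean(x), s = sd(x),
                              runs = list(r2 = 9, r3 = 6, r4 = 14,
                                          r5 = 3, r6 = 5, r7 = 15, r8 = 8)) {
  # Apply all eight rules and collect (rule, point index) pairs of violations
  viol <- list(
    `1` = nelsonr1(x, m, s),
    `2` = nelsonr2(x, m, runs$r2),
    `3` = nelsonr3(x, runs$r3),
    `4` = nelsonr4(x, m, runs$r4),
    `5` = nelsonr5(x, m, s, runs$r5),
    `6` = nelsonr6(x, m, s, runs$r6),
    `7` = nelsonr7(x, m, s, runs$r7),
    `8` = nelsonr8(x, m, s, runs$r8)
  )
  do.call(rbind, lapply(names(viol), function(r)
    if (length(viol[[r]])) data.frame(rule = as.integer(r), point = viol[[r]])))
}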
For example where the referenced article says "Eight points in a row.." I would like eight to be a parameter.
That's what the minrun parameter in these functions does.
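As a rough usage sketch with qcc (assuming the qcc package is installed and the functions above, including the hypothetical nelson_violations wrapper, are loaded), the group statistics, center line, and sigma estimate stored on a qcc object can be passed straight in:

library(qcc)

set.seed(1)
x <- matrix(rnorm(200, mean = 10), ncol = 5)   # 40 subgroups of size 5
q <- qcc(x, type = "xbar", plot = FALSE)

stats <- q$statistics                          # plotted subgroup means
nelsonr2(stats, m = q$center, minrun = 8)      # "eight points in a row" instead of nine
nelson_violations(stats, m = q$center, s = q$std.dev)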
Hello developers! I have encountered the error message
Error in if (obs <= ei) 2 * pv else 2 * (1 - pv) : missing value where TRUE/FALSE needed
which stops me from getting a value out of the Moran.I function in the ape package. Here is what I did:
library(ape)
nrstp <- data.frame(
X = c(300226.9, 300224.6, 300226.4, 300226.1, 300224.0, 300226.4, 300225.7, 300226.4, 300226.1, 300226.4, 300226.3, 300226.3, 300227.1),
Y = c(5057949, 5057952, 5057950, 5057950, 5057956, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057949),
V3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0))
nrstp = data.frame(nrstp)
dist = as.matrix(dist(cbind(nrstp$X, nrstp$Y)))
invdist = 1/dist
invdist[is.infinite(invdist)] <- 0
moranI = Moran.I(nrstp$V3, invdist)
The intention of this code is to calculate Moran's I from a series of points to check for spatial autocorrelation. So far, this seems to be the only function I could find for Moran's I in R. After some testing (I have thousands of groups of points), this error seems to happen only when the input vector contains a single repeated value (I tried numbers other than 0, and it still raises this error).
Could someone help me improve this code? Or is there a better suggestion for calculating Moran's I or testing spatial autocorrelation from linestrings (those point groups are the origin point of one linestring and the closest points from other linestrings within a 10 metre buffer of that origin point)?
Thanks in advance for any help!
The control-flow construct if (condition) do_something requires that condition is not NA.
In your case, obs <= ei evaluates to NA, which is why the error message missing value where TRUE/FALSE needed is generated.
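A minimal illustration of that behaviour:

x <- NA
if (x <= 0) "neg" else "non-neg"
# Error in if (x <= 0) "neg" else "non-neg" :
#   missing value where TRUE/FALSE needed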
To understand how obs <= ei ends up as NA, you can inspect the body of the Moran.I function:
Moran.I
function (x, weight, scaled = FALSE, na.rm = FALSE, alternative = "two.sided")
{
    if (dim(weight)[1] != dim(weight)[2])
        stop("'weight' must be a square matrix")
    n <- length(x)
    if (dim(weight)[1] != n)
        stop("'weight' must have as many rows as observations in 'x'")
    ei <- -1/(n - 1)
    nas <- is.na(x)
    if (any(nas)) {
        if (na.rm) {
            x <- x[!nas]
            n <- length(x)
            weight <- weight[!nas, !nas]
        }
        else {
            warning("'x' has missing values: maybe you wanted to set na.rm = TRUE?")
            return(list(observed = NA, expected = ei, sd = NA, p.value = NA))
        }
    }
    ROWSUM <- rowSums(weight)
    ROWSUM[ROWSUM == 0] <- 1
    weight <- weight/ROWSUM
    s <- sum(weight)
    m <- mean(x)
    y <- x - m
    cv <- sum(weight * y %o% y)
    v <- sum(y^2)
    obs <- (n/s) * (cv/v)
    if (scaled) {
        i.max <- (n/s) * (sd(rowSums(weight) * y)/sqrt(v/(n - 1)))
        obs <- obs/i.max
    }
    S1 <- 0.5 * sum((weight + t(weight))^2)
    S2 <- sum((apply(weight, 1, sum) + apply(weight, 2, sum))^2)
    s.sq <- s^2
    k <- (sum(y^4)/n)/(v/n)^2
    sdi <- sqrt((n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * s.sq) -
        k * (n * (n - 1) * S1 - 2 * n * S2 + 6 * s.sq))/((n -
        1) * (n - 2) * (n - 3) * s.sq) - 1/((n - 1)^2))
    alternative <- match.arg(alternative, c("two.sided", "less", "greater"))
    pv <- pnorm(obs, mean = ei, sd = sdi)
    if (alternative == "two.sided")
        pv <- if (obs <= ei) 2 * pv else 2 * (1 - pv)
    if (alternative == "greater")
        pv <- 1 - pv
    list(observed = obs, expected = ei, sd = sdi, p.value = pv)
}
<bytecode: 0x000001cd5e0715d0>
<environment: namespace:ape>
By assigning x = nrstp$V3 and weight = invdist, you are passing an x whose values are all identical, so y <- x - m is a vector of zeros. That makes cv = 0 and v = 0, and obs <- (n/s) * (cv/v) evaluates to 0/0, which is NaN. Consequently,
obs <= ei
[1] NA
To overcome the problem, you need to ensure that obs is not NaN. In your case that means x must contain at least two distinct values, so that v > 0. However, because I know nothing about this particular topic, I'm not sure whether that is always the right solution for your analysis.
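You can reproduce the chain of NaN and NA directly at the console with the data from the question:

x <- nrstp$V3
y <- x - mean(x)
sum(y^2)       # 0, so obs = (n/s) * (cv/v) involves 0/0
0/0            # NaN
NaN <= -1/12   # NA; here ei = -1/(n - 1) = -1/12 because n = 13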
The problem is that your x values are all the same. If you look at the code posted by Abdur Rohman, the relevant part of the calculation is
m <- mean(x)
y <- x - m
cv <- sum(weight * y %o% y)
v <- sum(y^2)
obs <- (n/s) * (cv/v)
If all x are the same, then m <- mean(x) is obviously that same value, so y, v, and cv are all 0.
For obs you divide cv by v, i.e. 0/0, which is NaN.
So at least one value of x should be different.
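For a batch run over many groups of points, one hedged way to cope with this (a sketch, not part of ape; safe_moran is my own name) is to guard against a constant input before calling Moran.I:

library(ape)

safe_moran <- function(v, w) {
  # Moran's I is undefined for a constant variable, so return NAs instead of erroring
  if (length(unique(v[!is.na(v)])) < 2) {
    return(list(observed = NA, expected = -1/(length(v) - 1), sd = NA, p.value = NA))
  }
  Moran.I(v, w)
}

safe_moran(nrstp$V3, invdist)   # returns NAs rather than throwing the error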
I am trying to calculate the distance between warehouse locations in RStudio, utilizing an algorithm from an academic paper. The formula accounts for the dimensions of location width, depth, and side of the aisle. The added complexity comes in when calculating the shortest distance with multiple cross-aisles. This is all based on this paper.
This is from a bird's eye view:
I have static values for α = 48, β = 96, γ = 108, Ω = 75, S = 22.
I then have a data frame with i as the key for location number, X for aisle number, Y for section number, Z for side number, and Cross Aisle as a boolean (0 = not a cross-aisle, 1 = is a cross-aisle). Here is a sample:
i X Y Z Cross Aisle
1 1 1 1 0
2 1 2 1 0
....
357 12 20 2 0
These are the formulas between locations i and j if the warehouse had no cross-aisles and was one rectangular grid:
Items in the same aisle (xi = xj):
dij = |yi - yj| * β + |zi - zj| * γ
If items are in different aisles, there are three different scenarios:
dij = |xi - xj| * (2α + γ) + v, if zi = zj
dij = (xj - xi) * (2α + γ) + γ + v, if zi = 1, zj = 2
dij = (xj - xi) * (2α + γ) - γ + v, if zi = 2, zj = 1
where v is the "vertical" distance (bird's eye, up-down aisle):
v = min(β * (2 * S - yi - yj), β * (yi + yj)) + 2Ω
(*Note: the academic paper has a typo in the v formula. It states 2 - yi - yj in the first bit, but I found another, original source that had it correctly as 2 * S - yi - yj.)
This piece of the formula is not the tricky part. It is a fairly simple if/then function in R to compute. However, this next bit with the cross-aisle gets me.
This is from the academic paper:
The authors state, essentially: there are two locations p1 and p2. There are two neighboring cross-aisles, a1 and a2, one above p1 and the other below p1. Similarly, cross-aisles b1 and b2 are found, which neighbor p2 and lead left. The distance between p1 and p2 is then:
d(p1, p2) = min{ d(p1, ai) + d(ai, bj) + d(bj, p2) : i, j ∈ {1, 2} }
I am unsure how to apply this algorithm to my data set and construct the necessary loops and matrix to find the distances between my warehouse locations. I could really use some help making sense of it.
Here is my actual data set.
Here is an image of my warehouse to give you a sense of the layout. The "X" locations are cross-aisles.
I was able to get a workable loop without the cross-aisles:
require(dplyr)      # for %>%
require(openxlsx)   # for read.xlsx
require(gtools)     # combinations() comes from gtools

whse.data <- read.xlsx("data set.xlsx", sheet = 1) %>%
  as.data.frame()

### CREATE COMBINATION OF LOCATIONS
cmbn.loc <- combinations(n = max(whse.data$i), r = 2, v = whse.data$i,
                         repeats.allowed = FALSE) %>%
  as.data.frame()

### CALCULATE DISTANCE BETWEEN LOCATIONS
LocDst <- function(dc, wc, wa, tr, s, df, comb){
  # Creates a distance between various locations
  #
  # Args:
  #   dc: Depth of cell (alpha)
  #   wc: Width of cell (beta)
  #   wa: Width of aisle (gamma)
  #   tr: turning radius (omega)
  #   s:  number of sections (S)
  #   df: Data frame with location i, x, y, z, Cross.Aisle data
  #   comb: Combination of locations to compare
  #
  # Returns:
  #   Data frame with distances between each location combination
  #
  dist.df_total <- data.frame()
  for (n in 1:nrow(comb)){
    i <- comb[n, 1]
    j <- comb[n, 2]
    xi <- df[df$i == i, 2]
    yi <- df[df$i == i, 3]
    zi <- df[df$i == i, 4]
    xj <- df[df$i == j, 2]
    yj <- df[df$i == j, 3]
    zj <- df[df$i == j, 4]
    v <- min(wc * (2 * s - yi - yj), wc * (yi + yj)) + 2 * tr
    if (xi == xj){
      dij <- abs(yi - yj) * wc + abs(zi - zj) * wa
    } else if (zi == zj){
      dij <- (abs(xi - xj) * (2 * dc + wa) + v)
    } else if (zi == 1 & zj == 2){
      dij <- ((xj - xi) * (2 * dc + wa) + wa + v)
    } else {
      dij <- ((xj - xi) * (2 * dc + wa) - wa + v)
    }
    dist.df <- data.frame(`i` = i, `j` = j, dist = dij)
    dist.df_total <- rbind.data.frame(dist.df_total, dist.df)
  }
  return(dist.df_total)
}

dist <- LocDst(48, 96, 108, 75, 18, whse.data, cmbn.loc)
I need a workable for loop, or something similar, to run Algorithm 1 above, please.
I was able to get something to work. If anyone has anything more straightforward, I am all ears. Maybe this will be helpful to someone out there!
I had to use Excel to calculate the distances between the cross-aisles. There's probably a way to code it, but it wasn't value-added for me at this time. Here's a sample of that data:
V1 V2 Dist
7 18 672
7 19 780
7 33 204
....
341 342 108
where V1 represents the first location number and V2 the second for all cross-aisle combinations.
Everything else should be computed within the code (beyond what I put above):
require(dplyr)
require(openxlsx)
require(gtools)   # combinations() comes from gtools

whse.data <- read.xlsx("data set.xlsx", sheet = 1) %>%
  as.data.frame()

### CREATE COMBINATION OF LOCATIONS
cmbn.loc <- combinations(n = max(whse.data$i), r = 2, v = whse.data$i,
                         repeats.allowed = FALSE) %>%
  as.data.frame()

# CROSS-AISLES IN EACH SHELF (cross.aisles is a data frame of the cross-aisle
# locations, loaded separately)
ca.shelf <- cross.aisles %>%
  group_by(Shelf) %>%
  summarise(No.Cross.Aisles = sum(Cross.Aisle)) %>%
  as.data.frame()

# DISTANCE BETWEEN CROSS AISLES
cmbn.cross.aisle <- combinations(n = nrow(cross.aisles),
                                 r = 2,
                                 v = cross.aisles$i,
                                 repeats.allowed = FALSE) %>%
  as.data.frame()
dist.cross.aisle <- read.xlsx("Combination of Cross-Aisles v3.xlsx", sheet = 1) %>%
as.data.frame()
# CROSS AISLE FUNCTION
CrsAisDst <- function(dc, wc, wa, tr, s, no.sh, df, comb, ca.m, d.m){
  # Creates a distance between various locations
  #
  # Args:
  #   dc: Depth of cell (alpha)
  #   wc: Width of cell (beta)
  #   wa: Width of aisle (gamma)
  #   tr: turning radius (omega)
  #   s:  number of sections (S)
  #   no.sh: number of shelves
  #   df: Data frame with location i, x, y, z, Cross.Aisle data
  #   comb: Combination of locations to compare
  #   ca.m: Cross-aisles matrix
  #   d.m: Distances between cross-aisles
  #
  # Returns:
  #   Data frame with distances between each location combination
  #
  dist.df_total <- data.frame()
  for (n in 1:nrow(comb)){
    i <- comb[n, 1]
    j <- comb[n, 2]
    xi <- df[df$i == i, 2]
    yi <- df[df$i == i, 3]
    zi <- df[df$i == i, 4]
    xj <- df[df$i == j, 2]
    yj <- df[df$i == j, 3]
    zj <- df[df$i == j, 4]
    v <- min(wc * (2 * s - yi - yj), wc * (yi + yj)) + 2 * tr
    if (xi == xj){
      # Same aisle: no cross-aisle needed
      min.dij <- abs(yi - yj) * wc + abs(zi - zj) * wa
    } else {
      shi <- df[df$i == i, 6]
      shj <- df[df$i == j, 6]
      ### CROSS-AISLES NEIGHBOURING EACH LOCATION
      ca.i <- data.frame(`i` = ca.m[ca.m$Shelf == shi, 1])
      ca.j <- data.frame(`j` = ca.m[ca.m$Shelf == shj, 1])
      ## DISTANCES FROM EACH LOCATION TO ITS CANDIDATE CROSS-AISLES
      dist.df_total.i <- data.frame()
      dist.df_total.j <- data.frame()
      for (m in 1:nrow(ca.i)){
        i.i <- i
        j.i <- ca.i[m, ]
        xi.i <- df[df$i == i.i, 2]
        yi.i <- df[df$i == i.i, 3]
        zi.i <- df[df$i == i.i, 4]
        xj.i <- df[df$i == j.i, 2]
        yj.i <- df[df$i == j.i, 3]
        zj.i <- df[df$i == j.i, 4]
        dij.i <- abs(yi.i - yj.i) * wc + abs(zi.i - zj.i) * wa
        dist.df.i <- data.frame(`i` = i.i, `j` = j.i, dist = dij.i)
        dist.df_total.i <- rbind.data.frame(dist.df_total.i, dist.df.i)
      }
      for (l in 1:nrow(ca.j)){
        i.j <- j
        j.j <- ca.j[l, ]
        xi.j <- df[df$i == i.j, 2]
        yi.j <- df[df$i == i.j, 3]
        zi.j <- df[df$i == i.j, 4]
        xj.j <- df[df$i == j.j, 2]
        yj.j <- df[df$i == j.j, 3]
        zj.j <- df[df$i == j.j, 4]
        dij.j <- abs(yi.j - yj.j) * wc + abs(zi.j - zj.j) * wa
        dist.df.j <- data.frame(`i` = i.j, `j` = j.j, dist = dij.j)
        dist.df_total.j <- rbind.data.frame(dist.df_total.j, dist.df.j)
      }
      # Nearest cross-aisle on each side, and the pre-computed distance between them
      min.i <- dist.df_total.i %>% slice(which.min(dist))
      min.j <- dist.df_total.j %>% slice(which.min(dist))
      aisle <- data.frame(V1 = min.i$j, V2 = min.j$j)
      dist.aisle <- semi_join(d.m, aisle, by = c("V1", "V2"))
      # DISTANCE WITHOUT USING A CROSS-AISLE
      if (zi == zj){
        dij <- (abs(xi - xj) * (2 * dc + wa) + v)
      } else if (zi == 1 & zj == 2){
        dij <- ((xj - xi) * (2 * dc + wa) + wa + v)
      } else {
        dij <- ((xj - xi) * (2 * dc + wa) - wa + v)
      }
      # Keep the shorter of the direct route and the cross-aisle route
      min.dij <- min(dij, (min.i$dist + min.j$dist + dist.aisle$Dist))
    }
    dist.df <- data.frame(`i` = i, `j` = j, dist = min.dij)
    dist.df_total <- rbind.data.frame(dist.df_total, dist.df)
  }
  return(dist.df_total)
}

aisle.dist <- CrsAisDst(48, 96, 108, 75, 18, 23, whse.data, cmbn.loc, cross.aisles,
                        dist.cross.aisle)
Output looks like this:
i j dist
7 18 672
7 19 780
7 33 204
....
341 342 108
(Note: this last sample I ran was just among cross-aisles, which is why the numbers look the same as above. I have tested it, though, and it will use the regular formula when that gives a shorter distance.)
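For reference, the heart of Algorithm 1 is just a minimum over the four candidate routes through the neighbouring cross-aisles. Here is a compact sketch of that single step, assuming a helper d0() that returns the plain within-grid distance between two location ids (d0, cross_aisle_dist, a and b are my own names, not from the paper or the code above):

cross_aisle_dist <- function(p1, p2, a, b, d0) {
  # a: the two cross-aisles neighbouring p1; b: the two neighbouring p2
  routes <- expand.grid(ai = a, bj = b)   # the four (ai, bj) combinations
  min(mapply(function(ai, bj) d0(p1, ai) + d0(ai, bj) + d0(bj, p2),
             routes$ai, routes$bj))
}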
So basically, let's say I have a function X that will calculate the random motion of a particle in one dimension. The function has different constants and a normal random variable W, and each step happens every 0.1 ms. I want to simulate 100 paths.
X <- 0;
Dt <- 0.0001;
V <- 0.5;
for (j in 0:100){
  W <- rnorm(100, j*Dt*V, 1);
  x[0] = 0;
  x[j] = x[j-1] + Dt*V + W*sqrt(Dt)
}
But I get an error saying "replacement has zero length". Also, after getting the array of the different positions of the particle, I would like to simulate it, but I am not sure how to do this.
Thank you
The error comes from indexing with 0: R vectors start at index 1, so x[0] selects nothing, and the assignment that depends on it ends up with a zero-length replacement. Initialising the vector and starting the loop at 2 avoids that (note also that each step should draw a single random number rather than 100):

X <- array()
Dt <- 0.0001
V <- 0.5
X[1] <- 0
for (j in 2:101){
  W <- rnorm(1, j*Dt*V, 1)   # one normal increment per step
  X[j] <- X[j-1] + Dt*V + W*sqrt(Dt)
}
I believe you are trying to do something like this:
X <- 0
Dt <- 0.0001
V <- 0.5
LEN <- 101
W <- rnorm(LEN - 1, Dt * V, 1)
x <- rep(0, LEN)
for (i in seq_len(LEN - 1)) {
  x[i + 1] <- x[i] + Dt * V + W[i] * sqrt(Dt)
}
x
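Since the question asks for 100 paths, here is a hedged sketch of how to extend this: each path is generated with a vectorised cumulative sum (equivalent to the loop above) and all paths are plotted with matplot. The names n.paths and paths are mine, not from the original code.

Dt <- 0.0001
V  <- 0.5
LEN <- 101
n.paths <- 100

# Each column is one path: cumulative sum of the per-step increments
paths <- sapply(seq_len(n.paths), function(p) {
  W <- rnorm(LEN - 1, Dt * V, 1)
  c(0, cumsum(Dt * V + W * sqrt(Dt)))
})

matplot(seq(0, by = Dt, length.out = LEN), paths, type = "l",
        lty = 1, xlab = "time (s)", ylab = "X(t)")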
I tried running the code below.
library(sfsmisc)   # integrate.xy() comes from the sfsmisc package

set.seed(307)
y <- rnorm(200)
h2 <- 0.3773427
t <- seq(-3.317670, 2.963407, length.out = 500)
fit <- density(y, bw = h2, n = 1024, kernel = "epanechnikov")
integrate.xy(fit$x, fit$y, min(fit$x), t[407])
However, I received the following error message:
Error in seq.default(a, length = max(0, b - a - 1)) :
  length must be non-negative number
I am not sure what's wrong.
I do not encounter any problem when I use t[406] or t[408], as follows:
integrate.xy(fit$x, fit$y, min(fit$x), t[406])
integrate.xy(fit$x, fit$y, min(fit$x), t[408])
Does anyone know what the problem is and how to fix it? I'd appreciate your help. Thanks!
I went through the source code for the integrate.xy function, and there seems to be a bug relating to the usage of the xtol argument.
For reference, here is the source code of integrate.xy function:
function (x, fx, a, b, use.spline = TRUE, xtol = 2e-08)
{
    dig <- round(-log10(xtol))
    f.match <- function(x, table) match(signif(x, dig), signif(table, dig))
    if (is.list(x)) {
        fx <- x$y
        x <- x$x
        if (length(x) == 0)
            stop("list 'x' has no valid $x component")
    }
    if ((n <- length(x)) != length(fx))
        stop("'fx' must have same length as 'x'")
    if (is.unsorted(x)) {
        i <- sort.list(x)
        x <- x[i]
        fx <- fx[i]
    }
    if (any(i <- duplicated(x))) {
        n <- length(x <- x[!i])
        fx <- fx[!i]
    }
    if (any(diff(x) == 0))
        stop("bug in 'duplicated()' killed me: have still multiple x[]!")
    if (missing(a))
        a <- x[1]
    else if (any(a < x[1]))
        stop("'a' must NOT be smaller than min(x)")
    if (missing(b))
        b <- x[n]
    else if (any(b > x[n]))
        stop("'b' must NOT be larger than max(x)")
    if (length(a) != 1 && length(b) != 1 && length(a) != length(b))
        stop("'a' and 'b' must have length 1 or same length !")
    else {
        k <- max(length(a), length(b))
        if (any(b < a))
            stop("'b' must be elementwise >= 'a'")
    }
    if (use.spline) {
        xy <- spline(x, fx, n = max(1024, 3 * n))
        if (xy$x[length(xy$x)] < x[n]) {
            if (TRUE)
                cat("working around spline(.) BUG --- hmm, really?\n\n")
            xy$x <- c(xy$x, x[n])
            xy$y <- c(xy$y, fx[n])
        }
        x <- xy$x
        fx <- xy$y
        n <- length(x)
    }
    ab <- unique(c(a, b))
    xtol <- xtol * max(b - a)
    BB <- abs(outer(x, ab, "-")) < xtol
    if (any(j <- 0 == apply(BB, 2, sum))) {
        y <- approx(x, fx, xout = ab[j])$y
        x <- c(ab[j], x)
        i <- sort.list(x)
        x <- x[i]
        fx <- c(y, fx)[i]
        n <- length(x)
    }
    ai <- rep(f.match(a, x), length = k)
    bi <- rep(f.match(b, x), length = k)
    dfx <- fx[-c(1, n)] * diff(x, lag = 2)
    r <- numeric(k)
    for (i in 1:k) {
        a <- ai[i]
        b <- bi[i]
        r[i] <- (x[a + 1] - x[a]) * fx[a] + (x[b] - x[b - 1]) * fx[b] +
            sum(dfx[seq(a, length = max(0, b - a - 1))])
    }
    r/2
}
The value given to the xtol argument is overwritten in the line xtol <- xtol * max(b - a), but the dig variable is calculated from the original value of xtol as given in the input to the function. Because of this mismatch, the f.match function, in the line bi <- rep(f.match(b, x), length = k), finds no match between b and x (i.e., it returns NA). This results in the error that you encountered.
A simple fix, at least for the case in question, would be to remove the xtol <- xtol * max(b - a) line. But you should file a bug report with the maintainer of the package for a more rigorous fix.
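If patching the package is not an option, a workaround is to skip integrate.xy for this step and use a plain trapezoidal rule over the density grid, which should give essentially the same value on a 1024-point grid. This is only a sketch (trapz_upto is my own helper name, and it stops at the last grid point below the upper limit rather than interpolating):

# Trapezoidal integral of y(x) from min(x) up to 'upper'
trapz_upto <- function(x, y, upper) {
  keep <- x <= upper
  xs <- x[keep]
  ys <- y[keep]
  sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2)
}

trapz_upto(fit$x, fit$y, t[407])   # compare with integrate.xy(fit$x, fit$y, min(fit$x), t[407])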
I'm pretty new to R, and am struggling a bit with it. I have the following code:
repeat {
  if (t > 1000)
    break
  else {
    y1 <- rpois(50, 15)
    y2 <- rpois(50, 15)
    y <- c(y1, y2)
    p_0y <- matrix(nrow = max(y) - min(y), ncol = 1)
    i = min(y)
    while (i <= max(y)) {
      p_0y[i - min(y), ] = (length(which(y1 == i))/50)
      i <- i + 1
    }
    p_y <- matrix(nrow = max(y) - min(y), ncol = 1)
    j = min(y)
    while (j <= max(y)) {
      p_y[j - min(y), ] = (length(which(y == j))/100)
      j <- j + 1
    }
    p_0yx <- p_0y[rowSums(p_0y == 0) == 0]
    p_yx <- p_y[rowSums(p_0y == 0) == 0]
    g = 0
    logvect <- matrix(nrow = (length(p_yx)), ncol = 1)
    while (g <= (length(p_yx))) {
      logvect[g, ] = (p_0yx[g])/(p_yx[g])
      g <- g + 1
    }
    p_0yx %*% (log2(logvect))
    print(p_0yx %*% (log2(logvect)))
    t <- t + 1
  }
}
I am happy with everything up to the last line, but instead of printing the value of p_0yx %*% (log2(logvect)) to the screen, I would like to store it in another vector. Any ideas? I have tried doing it in a similar way as in the nested loop, but it doesn't seem to work.
Thanks
The brief answer is to first declare a variable. Put it before everything you've posted here. I'm going to call it temp. It will hold all of the values.
temp <- numeric(1000)
Then, instead of your print line use
temp[t] <- p_0yx %*% log2(logvect)
As an aside, your code is doing some weird things. Look at the first index of p_0y: it is effectively an index to item 0 of that matrix, and R starts indexing at 1. Also, when you set the number of rows in that matrix you use max(y) - min(y); if the max is 10 and the min is 1, then there are only 9 rows, and I'm betting you really wanted to add one. Finally, your code is very un-R-like, with all of the unnecessary while loops. For example, your whole last loop (and the initialization of logvect) can be replaced with:
logvect = (p_0yx)/(p_yx)
But back to the errors, and some more R-ness: could the following code ...
p_0y <- matrix(nrow = max(y) - min(y), ncol = 1)
i = min(y)
while (i <= max(y)) {
  p_0y[i - min(y), ] = (length(which(y1 == i))/50)
  i <- i + 1
}
maybe be replaced, more correctly, with this?
p_0y <- numeric(max(y) - min(y) + 1)
p_0y[sort(unique(y1)) - min(y1) + 1] = table(y1)/50
p_0y <- matrix(p_0y, ncol = 1)
(similar rethinking of the rest of your code could eliminate the rest of the loops as well)
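To give a flavour of that rethinking, here is a hedged sketch of one full iteration of the loop body with no while loops at all, assuming t and temp are initialised before the repeat loop (e.g. t <- 1 and temp <- numeric(1000)); table() with an explicit factor does the counting, including the zero counts:

y1 <- rpois(50, 15)
y2 <- rpois(50, 15)
y  <- c(y1, y2)
vals <- min(y):max(y)                                   # one entry per possible value
p_0y <- as.numeric(table(factor(y1, levels = vals))) / 50
p_y  <- as.numeric(table(factor(y,  levels = vals))) / 100
keep <- p_0y != 0                                       # drop values never seen in y1
temp[t] <- sum(p_0y[keep] * log2(p_0y[keep] / p_y[keep]))
t <- t + 1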