I am trying to generate a polychoric correlation matrix with the psych package in R for a 227 x 6 data table which I have called nepr. After importing the data from an Excel spreadsheet, entering the code:
nepr=as.data.frame(nepr)
attach(nepr)
library(psych)
out=polychoric(nepr)
neprpoly=out$rho
print(neprpoly,digits=2)
generates the following error message:
>Error in if (any(lower > upper)) stop("lower>upper integration
limits"): missing value where TRUE/FALSE needed
>In addition: warning messages:
>1. In polychoric(nepr): The items do not have an equal number
of response alternatives, global set to FALSE.
>2. In qnorm(cumsum(rsum)[-length(rsum)]): NaNs produced
I was expecting the code I entered to produce a polychoric correlation matrix based on the data frame nepr, and I don't know how to interpret or act on the error messages I received.
Can anyone suggest what changes I need to make to the code to address the error messages?
A sample of the dataset is as follows:
structure(list(Balance = c(4, 4, 5, 5, 3, 4, 3, 4, 2, 2, 2, 5,
2, 2, 2, 2, 1, 2, 4, 1), Earth = c(4, 5, 5, 5, 5, 5, 5, 4, 4,
4, 4, 5, 3, 4, 4, 2, 5, 4, 5, 5), Plants = c(2, 2, 2, 3, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 4), Modify = c(2, 2, 1,
1, 2, 2, 2, 2, 4, 2, 4, 2, 4, 2, 2, 2, 2, 2, 2, 2), Growth =
c(2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 1, 4, 2, 2, 4, 4, 4, 1, 2),
Mankind = c(2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1,
1, 1, 2)), row.names = c(NA,20L), class = "data.frame")
The data consists of inputs of Likert scale rankings (ranked 1-5) to the items 'Balance', 'Earth', 'Plants', 'Modify', 'Growth', and 'Mankind'. There are no missing values in any cells of the 227 row x 6 item matrix; Balance, Plants, & Growth all contain the values 1-5; Earth contains the values 2-5 (no ranking of 1 recorded); Mankind contains the values 1-4 (no ranking of 5 recorded). When I ran the original data set (before reversing the valence of the last 3 columns) I was able to get a polychoric matrix with no problems even though the data contained the Earth data as it appears in the nepr data set. I assume that it is not uncommon to have similar data sets from surveys where variables do not necessarily contain the full range of response values.
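Before changing the call, it can help to confirm which response categories each item actually contains. The following is a minimal base-R sketch using the 20-row sample above. The object name `nepr_sample` is introduced here for illustration only; with the full data the same calls would be applied to `nepr`. It only inspects the data and is not a fix for the error.

```r
# Sample rows from the question; nepr_sample is a hypothetical name.
nepr_sample <- data.frame(
  Balance = c(4, 4, 5, 5, 3, 4, 3, 4, 2, 2, 2, 5, 2, 2, 2, 2, 1, 2, 4, 1),
  Earth   = c(4, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 5, 3, 4, 4, 2, 5, 4, 5, 5),
  Plants  = c(2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 4),
  Modify  = c(2, 2, 1, 1, 2, 2, 2, 2, 4, 2, 4, 2, 4, 2, 2, 2, 2, 2, 2, 2),
  Growth  = c(2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 1, 4, 2, 2, 4, 4, 4, 1, 2),
  Mankind = c(2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 2)
)

# Frequency of each observed response, per item.
category_counts <- lapply(nepr_sample, table)

# Smallest and largest observed response, per item (row 1 = min, row 2 = max).
category_range <- sapply(nepr_sample, range)

# Number of missing or non-finite entries, per item.
n_bad <- sapply(nepr_sample, function(col) sum(!is.finite(col)))

print(category_counts)
print(category_range)
print(n_bad)
```

Comparing these tables before and after the reverse-coding step shows whether the recoding changed the set of categories in any column.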
I have to do the following:
I have a vector, let us say
x <- c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5, 3, 2, 11, 1, 3, 3, 4, 1)
I have to subset the remainder of a vector after 1, 2, 3, 4 occurred at least once.
So the subset new vector would only include 4, 5, 5, 3, 2, 11, 1, 3, 3, 4, 1.
I need a relatively easy solution on how to do this. It might be possible to do an if and while loop with breaks, but I am kinda struggling to come up with a solution.
Is there a simple (even mathematical way) to do this in R?
Use sapply to find where each predefined number occurs for the first time, then drop everything up to the latest of those positions.
x[-seq(max(sapply(1:4, function(y) which(x == y)[1])))]
# [1] 4 5 5 3 2 11 1 3 3 4 1
Data
x <- c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5, 3, 2, 11, 1, 3, 3, 4, 1)
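The same idea can be wrapped in a small function. This is only a sketch: the name `drop_until_all_seen` is hypothetical, and returning an empty vector when one of the targets never occurs is an assumption about the desired behaviour, not something stated in the question.

```r
# Hypothetical helper: drop everything up to and including the position
# where the last of `targets` makes its first appearance.
drop_until_all_seen <- function(x, targets) {
  first_pos <- match(targets, x)   # first index of each target (NA if absent)
  if (anyNA(first_pos)) {
    return(x[0])                   # assumption: a missing target means nothing qualifies
  }
  cutoff <- max(first_pos)
  x[seq_along(x) > cutoff]         # also safe when cutoff == length(x)
}

x <- c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5, 3, 2, 11, 1, 3, 3, 4, 1)
drop_until_all_seen(x, 1:4)
# [1]  4  5  5  3  2 11  1  3  3  4  1
```

Using a logical index instead of `x[-seq(cutoff)]` keeps the result well defined when the cutoff falls on the last element.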
You can use run length encoding for this
x = c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5, 3, 2, 11, 1, 3, 3, 4, 1)
encoded = rle(x)
# Pick the first location of 1, 2, 3, and 4
# Then find the max index location
indices = c(which(encoded$values == 1)[1],
which(encoded$values == 2)[1],
which(encoded$values == 3)[1],
which(encoded$values == 4)[1])
index = max(indices)
# Find the index of x corresponding to your split location
reqd_index = cumsum(encoded$lengths)[index-1] + 2
# Print final split value
x[reqd_index:length(x)]
The result is as follows
> x[reqd_index:length(x)]
[1] 4 5 5 3 2 11 1 3 3 4 1
I have a vector that looks like this:
c(1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5..)
I want to get the index of when the element changes, i.e. (1,5,9,...)
I know how to do it with a for loop, but I am looking for a faster way as my vector is very large.
Thanks,
Try
which(c(TRUE,diff(v1)!=0))
Or
match(unique(v1), v1)
Or if the vector is sorted
head(c(1, findInterval(unique(v1), v1)+1),-1)
data
v1 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
4, 4, 5, 5, 5, 5, 5)
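A short check, as a sketch, that the three suggestions agree on this sample vector, together with a second vector (`v2`, made up here for illustration) in which a value reappears. In that case `which(diff(...))` reports every change point, while `match(unique(...))` reports only the first occurrence of each distinct value.

```r
v1 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
        4, 4, 5, 5, 5, 5, 5)

by_diff   <- which(c(TRUE, diff(v1) != 0))
by_match  <- match(unique(v1), v1)
by_findin <- head(c(1, findInterval(unique(v1), v1) + 1), -1)

by_diff
# [1]  1  5  9 15 23

# Illustrative vector in which the value 1 comes back after a run of 2s.
v2 <- c(1, 1, 2, 2, 1, 1)
change_points     <- which(c(TRUE, diff(v2) != 0))   # 1 3 5
first_occurrences <- match(unique(v2), v2)           # 1 3
```

Which variant is appropriate depends on whether repeated values can occur in separate runs.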
Another fun approach:
v1 <- c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 7, 8)
head(c(1, cumsum(rle(v1)$lengths) + 1), -1)
Or if you have magrittr then it can become
library(magrittr)
v1 %>%
rle %>%
.$lengths %>%
cumsum %>%
add(1) %>%
c(1, .) %>%
head(-1)
Result: 1 3 4 5 7 8 9 12
Might look weird but it's fun to think that through :)
Explanation: cumsum(rle(v1)$lengths) gets you almost all the way there, but it gives the index where each run ends rather than where the next run starts. That is why we add one to each element, prepend the index 1 (the start of the first run), and remove the last element (which would point past the end of the vector).
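The intermediate values can be written out one step at a time. The variable names below are chosen here for illustration.

```r
v1 <- c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 7, 8)

run_lengths <- rle(v1)$lengths             # 2 1 1 2 1 1 3 1
run_ends    <- cumsum(run_lengths)         # 2 3 4 6 7 8 11 12
next_starts <- run_ends + 1                # 3 4 5 7 8 9 12 13
run_starts  <- head(c(1, next_starts), -1) # drop 13, which is past the end

run_starts
# [1]  1  3  4  5  7  8  9 12
```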
I have the 2 tables as below
subj <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
gamble <- c(1, 2, 3, 1, 2, 3, 1, 2, 3)
ev <- c(4, 5, 6, 4, 5, 6, 4, 5, 6)
table1 <- data.frame(subj, gamble, ev)
subj2 <- c(1, 2, 3)
gamble2 <- c(1, 3, 2)
table2 <- data.frame(subj2, gamble2)
I want to merge the two tables by subject and gamble, keeping only the rows of table 1 whose gamble matches the gamble listed for that subject in table 2. The expected output is as follows:
subj gamble ev
1 1 4
2 3 6
3 2 5
You are looking for merge
merge(table1, table2, by.x=c("subj", "gamble"), by.y=c("subj2", "gamble2"), all=FALSE, sort=TRUE)
edited as per Ananda's helpful observation
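Run against the tables from the question, the call can be checked as follows. This is a sketch that rebuilds both tables inline; `result` is a name introduced here. The `all` and `sort` arguments are left at their defaults, which are `FALSE` and `TRUE`.

```r
table1 <- data.frame(
  subj   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  gamble = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  ev     = c(4, 5, 6, 4, 5, 6, 4, 5, 6)
)
table2 <- data.frame(
  subj2   = c(1, 2, 3),
  gamble2 = c(1, 3, 2)
)

# Inner join on the (subject, gamble) pair; key columns take table1's names.
result <- merge(table1, table2,
                by.x = c("subj", "gamble"),
                by.y = c("subj2", "gamble2"))
result
#   subj gamble ev
# 1    1      1  4
# 2    2      3  6
# 3    3      2  5
```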
I know that if you decimate the series generated by a linear feedback shift register, you get a new series and a new polynomial. For example, if you sample every fifth element in the series generated by an LFSR with polynomial x^4+x+1, you get the series generated by x^2+x+1. I can find the second polynomial (x^2+x+1) by brute force, which is fine for low-order polynomials. However, for higher-order polynomials, the time required for brute force becomes unreasonable.
So the question is: is it possible to find the decimated polynomial analytically?
I recently read this article and thought of it when I saw your question; I hope it helps.
Given a primitive polynomial over GF(q), one can obtain another primitive polynomial by decimating an LFSR sequence obtained from the initial polynomial. This is demonstrated in the code below.
K := GF(7);
C := PrimitivePolynomial(K, 2);
C;
D^2 + 6*D + 3
In order to generate an LFSR sequence, we must first multiply this polynomial by a suitable constant so that the trailing coefficient becomes 1.
C := C * Coefficient(C,0)^-1;
C;
5*D^2 + 2*D + 1
We are now able to generate an LFSR sequence of length 7^2 - 1 = 48. The initial state can be anything other than [0, 0].
t := LFSRSequence (C, [K| 1,1], 48);
t;
[ 1, 1, 0, 2, 3, 5, 3, 4, 5, 5, 0, 3, 1, 4, 1, 6, 4, 4, 0, 1, 5, 6, 5, 2, 6, 6,
0, 5, 4, 2, 4, 3, 2, 2, 0, 4, 6, 3, 6, 1, 3, 3, 0, 6, 2, 1, 2, 5 ]
We decimate the sequence by a value d having the property gcd(d, 48)=1.
t := Decimation(t, 1, 5);
t;
[ 1, 5, 0, 6, 5, 6, 4, 4, 3, 1, 0, 4, 1, 4, 5, 5, 2, 3, 0, 5, 3, 5, 1, 1, 6, 2,
0, 1, 2, 1, 3, 3, 4, 6, 0, 3, 6, 3, 2, 2, 5, 4, 0, 2, 4, 2, 6, 6 ]
B := BerlekampMassey(t);
B;
3*D^2 + 5*D + 1
To get the corresponding primitive polynomial, we multiply by a constant to make it monic.
B := B * Coefficient(B, 2)^-1;
B;
D^2 + 4*D + 5
IsPrimitive(B);
true
From these notes: "The decimation by n>0 of a m-sequence c, denoted as c[n],
has a period equal to N/gcd(N,n), if it is not the all-zero
sequence, its generator polynomial ĝ(x) has roots that are nth
powers of the roots of g(x)"
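The binary example from the question can be checked numerically in R. This is a sketch under stated assumptions: the helper `lfsr_gf2` is hypothetical, and the polynomial x^4+x+1 is read as the recurrence s[n+4] = s[n+1] + s[n] (mod 2); other texts use the reciprocal convention. The code verifies that the decimated sequence satisfies the recurrence of x^2+x+1 and has period 15/gcd(15,5) = 3. It does not derive the polynomial.

```r
# Hypothetical helper: n terms of a binary linear recurrence.
# taps are offsets k such that s[i + m] = sum(s[i + k]) mod 2, with m = length(init).
lfsr_gf2 <- function(init, taps, n) {
  m <- length(init)
  s <- integer(n)
  s[seq_len(m)] <- init
  for (i in seq_len(n - m)) {
    s[i + m] <- sum(s[i + taps]) %% 2L
  }
  s
}

# x^4 + x + 1, assumed to mean s[n+4] = s[n+1] + s[n]; period 15.
s <- lfsr_gf2(c(1L, 0L, 0L, 0L), taps = c(0L, 1L), n = 60L)

# Take every fifth element, starting with the first.
decimated <- s[seq(1, length(s), by = 5)]
k <- length(decimated)

# x^2 + x + 1 corresponds to d[n+2] = d[n+1] + d[n] (mod 2).
satisfies_x2_x_1 <- all(
  decimated[3:k] == (decimated[2:(k - 1)] + decimated[1:(k - 2)]) %% 2L
)

# Period of the decimated sequence: N / gcd(N, n) = 15 / gcd(15, 5) = 3.
has_period_3 <- all(decimated == rep(decimated[1:3], length.out = k))

decimated[1:6]
# [1] 1 0 1 1 0 1
satisfies_x2_x_1
# [1] TRUE
```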