How can you find the polynomial for a decimated LFSR?

I know that if you decimate the sequence generated by a linear feedback shift register (LFSR), you get a new sequence with a new generator polynomial. For example, if you sample every fifth element of the sequence generated by an LFSR with polynomial x^4 + x + 1, you get the sequence generated by x^2 + x + 1. I can find the second polynomial (x^2 + x + 1) by brute force, which is fine for low-order polynomials. However, for higher-order polynomials, the time required to brute force it becomes unreasonable.
So the question is: is it possible to find the decimated polynomial analytically?

I recently read this article and thought of it when I saw your question; I hope it helps.
Given a primitive polynomial over GF(q), one can obtain another primitive polynomial by decimating an LFSR sequence obtained from the initial polynomial. This is demonstrated in the Magma code below.
K := GF(7);
C := PrimitivePolynomial(K, 2);
C;
D^2 + 6*D + 3
In order to generate an LFSR sequence, we must first multiply this polynomial by a suitable constant so that the trailing coefficient becomes 1.
C := C * Coefficient(C,0)^-1;
C;
5*D^2 + 2*D + 1
We are now able to generate an LFSR sequence of length 7^2 - 1 = 48. The initial state can be anything other than [0, 0].
t := LFSRSequence (C, [K| 1,1], 48);
t;
[ 1, 1, 0, 2, 3, 5, 3, 4, 5, 5, 0, 3, 1, 4, 1, 6, 4, 4, 0, 1, 5, 6, 5, 2, 6, 6,
0, 5, 4, 2, 4, 3, 2, 2, 0, 4, 6, 3, 6, 1, 3, 3, 0, 6, 2, 1, 2, 5 ]
We decimate the sequence by a value d having the property gcd(d, 48)=1.
t := Decimation(t, 1, 5);
t;
[ 1, 5, 0, 6, 5, 6, 4, 4, 3, 1, 0, 4, 1, 4, 5, 5, 2, 3, 0, 5, 3, 5, 1, 1, 6, 2,
0, 1, 2, 1, 3, 3, 4, 6, 0, 3, 6, 3, 2, 2, 5, 4, 0, 2, 4, 2, 6, 6 ]
B := BerlekampMassey(t);
B;
3*D^2 + 5*D + 1
To get the corresponding primitive polynomial, we multiply by a constant to make it monic.
B := B * Coefficient(B, 2)^-1;
B;
D^2 + 4*D + 5
IsPrimitive(B);
true

From these notes: "The decimation by n > 0 of an m-sequence c, denoted c[n], has a period equal to N/gcd(N, n); if it is not the all-zero sequence, its generator polynomial ĝ(x) has roots that are the n-th powers of the roots of g(x)."
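For binary LFSRs, that root-power property gives a direct analytic construction: if alpha is a root of the original polynomial g(x), the decimated polynomial is the minimal polynomial of beta = alpha^d, obtained by multiplying out (x + beta)(x + beta^2)(x + beta^4)... over the Frobenius conjugates of beta. Below is a minimal pure-R sketch of this over GF(2^m); the helper names gf_mul, gf_pow and min_poly are my own, not from any package, and field elements are represented as integers whose bits are polynomial coefficients.

# Multiply two GF(2^m) elements modulo the defining polynomial `poly`
# (an integer bitmask that includes the x^m term).
gf_mul <- function(a, b, poly, m) {
  r <- 0L
  while (b > 0L) {
    if (bitwAnd(b, 1L) == 1L) r <- bitwXor(r, a)  # add a if this bit of b is set
    b <- bitwShiftR(b, 1L)
    a <- bitwShiftL(a, 1L)
    if (bitwAnd(bitwShiftR(a, m), 1L) == 1L) a <- bitwXor(a, poly)  # reduce mod poly
  }
  r
}

# a^n by repeated multiplication (slow but simple).
gf_pow <- function(a, n, poly, m) {
  r <- 1L
  for (i in seq_len(n)) r <- gf_mul(r, a, poly, m)
  r
}

# Minimal polynomial of beta over GF(2): multiply out (x + c) over the
# Frobenius conjugates c = beta, beta^2, beta^4, ... of beta. Assumes the
# decimated sequence is not all-zero, per the quote above.
min_poly <- function(beta, poly, m) {
  conj <- integer(0)
  cc <- beta
  while (!(cc %in% conj)) {
    conj <- c(conj, cc)
    cc <- gf_mul(cc, cc, poly, m)    # Frobenius map c -> c^2
  }
  coeffs <- 1L                       # coeffs[i + 1] is the coefficient of x^i
  for (cc in conj) {
    nxt <- integer(length(coeffs) + 1L)
    for (i in seq_along(coeffs)) {
      nxt[i]     <- bitwXor(nxt[i], gf_mul(coeffs[i], cc, poly, m))  # constant part
      nxt[i + 1] <- bitwXor(nxt[i + 1], coeffs[i])                   # x * coeffs[i]
    }
    coeffs <- nxt
  }
  coeffs                             # entries are 0/1: the GF(2) coefficients
}

# The example from the question: g(x) = x^4 + x + 1 (bitmask 10011 = 19),
# alpha = x (bitmask 2), decimation d = 5.
poly <- 19L; m <- 4L
beta <- gf_pow(2L, 5L, poly, m)
min_poly(beta, poly, m)              # 1 1 1, i.e. 1 + x + x^2 = x^2 + x + 1

Running it reproduces the x^2 + x + 1 from the question without any search; the cost grows polynomially in the degree m rather than exponentially.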

Related

How do I generate a polychoric correlation matrix in R-psych

I am trying to generate a polychoric correlation matrix in R-psych for a 227 x 6 data table which I have called nepr. After importing the data from an Excel spreadsheet, I enter the following code:
nepr=as.data.frame(nepr)
attach(nepr)
library(psych)
out=polychoric(nepr)
neprpoly=out$rho
print(neprpoly,digits=2)
generates the following error message:
> Error in if (any(lower > upper)) stop("lower>upper integration limits"): missing value where TRUE/FALSE needed
> In addition: Warning messages:
> 1. In polychoric(nepr): The items do not have an equal number of response alternatives, global set to FALSE.
> 2. In qnorm(cumsum(rsum)[-length(rsum)]): NaNs produced
I was expecting the code to produce a polychoric correlation matrix based on the dataframe nepr, and I don't know how to interpret or act on the error messages I received.
Can anyone suggest what changes I need to make to the code to address the error messages?
A sample of the dataset is as follows:
structure(list(Balance = c(4, 4, 5, 5, 3, 4, 3, 4, 2, 2, 2, 5,
2, 2, 2, 2, 1, 2, 4, 1), Earth = c(4, 5, 5, 5, 5, 5, 5, 4, 4,
4, 4, 5, 3, 4, 4, 2, 5, 4, 5, 5), Plants = c(2, 2, 2, 3, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 4), Modify = c(2, 2, 1,
1, 2, 2, 2, 2, 4, 2, 4, 2, 4, 2, 2, 2, 2, 2, 2, 2), Growth = c(2,
1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 1, 4, 2, 2, 4, 4, 4, 1, 2),
Mankind = c(2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1,
1, 1, 2)), row.names = c(NA, 20L), class = "data.frame")
The data consists of Likert-scale responses (ranked 1-5) to the items 'Balance', 'Earth', 'Plants', 'Modify', 'Growth', and 'Mankind'. There are no missing values in any cells of the 227 row x 6 item matrix. Balance, Plants, and Growth all contain the values 1-5; Earth contains the values 2-5 (no ranking of 1 recorded); Mankind contains the values 1-4 (no ranking of 5 recorded). When I ran the original data set (before reversing the valence of the last 3 columns) I was able to get a polychoric matrix with no problems, even though that data contained the Earth column exactly as it appears in the nepr data set. I assume it is not uncommon for survey data sets to have variables that do not cover the full range of response values.
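The warnings themselves suggest a first diagnostic step: the psych message says the items do not have an equal number of response alternatives (which matches Earth lacking a 1 and Mankind lacking a 5), and the NaNs from qnorm() suggest a degenerate or inconsistent tabulation somewhere. A small diagnostic sketch (an assumption about where to look, not a confirmed fix):

# Tabulate each item to see which response categories actually occur;
# the items whose observed categories differ from the others are the
# ones behind the "global set to FALSE" warning.
lapply(nepr, table)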

How do I interpret the coefficients of a glm with binomial error distribution?

I would be happy if someone could help me understand a glm with binomial error distribution.
Let's assume the following df:
year<-c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3,3, 3, 3, 3, 3, 3, 3, 3)
success<-c(1, 0, 3, 1, 1, 2, 6, 0, 1, 1, 12, 2, NA, 6, 12, 0, 10,
7, 4, 10, 13, 1, 2, 1, 18, 6, 3, 8, 3, 1, 9, 15, 6, 12,
6, 15, 13, 6, 8, 6, 2, 11, 6, 1, 12, 0, 4, 15, 0, 3, 18,
5, 6, 17, 5, 3, 17, 8, 0, 7, 12, 10, 26, 12, 4, 17, 1, 8,
2, 7, 14, 8)
no_success<-c(1, 9, 5, 4, 6, 1, 4, 4, 6, 10, 16, 4, NA, 3, NA, 3,
5, 5, 6, 10, 0, 5, 3, 10, 1, 7, 11, 8, 20, 4, 3, 3,
19, 1, 11, 4, 6, 4, 9, 4, 10, 4, 2, 8, 3, 1, 13, 3,
5, 7, 5, 9, 3, 6, 3, 4, 3, 13, 6, 5, 10, 3, 1, 0,
18, 6, 13, 0, 3, 2, 2, 2)
df<-data.frame(year,success,no_success)
df$success<-as.integer(df$success)
df$no_success<-as.integer(df$no_success)
If I want to know whether there is a linear increase or decrease across years in the success (versus no_success) of a made-up treatment, I fit a binomial glm:
m <- glm(cbind(success, no_success) ~ year,
         data = df, family = "quasibinomial",
         na.action = na.exclude)
summary(m)
I changed to "quasibinomial" here because of overdispersion.
From the summary I see that there is a significant effect: P: 0.0219 *
As the coefficients of a binomial glm are on the log-odds scale, I compute exp(estimate) = exp(0.3099) = 1.363.
So there is an increase in the odds of success of 1.363 per year.
My Questions are:
1.) When I exponentiate a negative estimate, the result is always positive. That cannot be correct; there must be a way to express negative relationships.
2.) When I want to visualize multiple linear models, I like to display the estimates.
For a "normal" lm I would display the estimate and confidence interval like this: divide the estimate by the mean of the observations, then subtract and add 1.96 times the (similarly scaled) standard error.
Estimate.mean<-exp(0.3099)/mean(df$or,na.rm=TRUE)
Std.Error.mean<-exp(0.1321)/mean(df$or,na.rm=TRUE)
low<-Estimate.mean-Std.Error.mean*1.96
high<-Estimate.mean+Std.Error.mean*1.96
If this confidence interval does not touch the zero line, the effect should be significantly different from zero.
But here the lower bound is -0.3901804 and the upper bound is 1.608095, which does not look like a significant linear relationship, despite the low p-value from the glm (0.0219).
What have I mixed up here? I am happy for any suggestions.
The "zero line" in this case is x=1 and not x=0.
Question 2:
the question is. Is there a effect that is different from zero?
But odds of 1 basicaly means zero.
Question 1:
When the estimate is exp the result can not be negative.But odds below 1 express a negative effect.
Here are some sources on calculating the confidence interval, for anyone stumbling over this post:
https://fromthebottomoftheheap.net/2018/12/10/confidence-intervals-for-glms/
https://stats.stackexchange.com/questions/304833/how-to-calculate-odds-ratio-and-95-confidence-interval-for-logistic-regression
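Putting the two points together, a minimal sketch of what those links describe (assuming the model m from the question; the interval is built on the log-odds scale and exponentiated afterwards, so its reference line is an odds ratio of 1, not 0):

# Approximate normal-theory interval on the link scale, then exponentiate.
est <- summary(m)$coefficients["year", "Estimate"]    # 0.3099 in the question
se  <- summary(m)$coefficients["year", "Std. Error"]  # 0.1321 in the question
exp(est + c(lower = -1.96, upper = 1.96) * se)
# with the quoted numbers: roughly 1.05 to 1.77, which excludes 1,
# consistent with the p-value of 0.0219

# Or profile-likelihood intervals (R profiles the likelihood via MASS):
exp(confint(m, "year"))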

Confidence interval of episode duration frequencies

I have the episode duration data (in days)
dur<-c(1, 2, 1, 2, 1, 3, 11, 2, 2, 3, 2, 4, 1, 2, 2, 1, 2, 10, 1, 1, 2, 2, 18, 2, 2, 2, 1, 7, 1, 1, 11, 25, 17, 2, 2, 9, 3, 3, 2, 5, 3, 2, 3, 2, 5, 363, 1, 1, 2, 2)
This means the first episode lasted 1 day, the second 2 days, the third 1 day, and so on.
table(dur) summarizes the duration data (12 instances of 1 day, 20 instances of 2 days etc)
freq.table<-(table(dur)/sum(table(dur))) gives me the frequency of the observed durations of episodes (point estimates).
How can I get confidence intervals of freq.table in R? What would be the most appropriate way for this kind of data?
Edit: I am interested in estimating the CI of the frequency of episode durations of 1, 2, ..., n days
A fast and easy way to get CIs for proportions in R is the function binom.test, as in:
dur <- c(1, 2, 1, 2, 1, 3, 11, 2, 2, 3, 2, 4,
1, 2, 2, 1, 2, 10, 1, 1, 2, 2, 18, 2,
2, 2, 1, 7, 1, 1, 11, 25, 17, 2, 2, 9,
3, 3, 2, 5, 3, 2, 3, 2, 5, 363, 1, 1, 2, 2)
t <- table(dur)
n <- length(dur)
ci <- sapply(t, function(x) binom.test(x, n, conf.level = .95)$conf.int)
rownames(ci) <- c("lower", "upper")
print(ci)
That is supposing that the data-generating process for each episode is anything like a binomial process.
Edit after first comment
As Roland pointed out in an earlier comment above, you have not stated the problem in unambiguous statistical terms, so I made some assumptions. I suppose Roland would suggest trying to find a distribution for all the possible durations as a whole system. Considering a mode at 2 and the existence of an observation with value 363, this is unlikely to be a common distribution like the Poisson or binomial. Knowing nothing about the data-generating process, I estimated a confidence interval for each observed outcome on its own, without regard to the distribution as a whole. For each observed outcome I assumed a binomial distribution, which you should look into before using my proposition as an answer for anything serious.
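If the per-cell binomial assumption feels too strong, one alternative sketch is a nonparametric bootstrap over whole episodes, which captures the joint sampling variability of all the cells at once (again an assumption about what is wanted, not a definitive method):

# Resample episodes with replacement and recompute the frequency table
# each time; percentile intervals are then read off the replicates.
set.seed(1)
lev <- sort(unique(dur))
boot_freq <- replicate(10000, {
  s <- sample(dur, replace = TRUE)
  as.numeric(table(factor(s, levels = lev))) / length(s)
})
rownames(boot_freq) <- lev
apply(boot_freq, 1, quantile, probs = c(0.025, 0.975))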

Why does -1*List object return an empty list?

I was trying some operations on a list object and wanted to see some "broadcast" behavior:
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x = -1*x
In [46]: x
Out[46]: []
I was expecting something like x = [1, -1, -2, -3, -4, -5, -6, -7, -8, -9].
What is actually happening?
You can only do this kind of multiplication with a pandas Series (or better, the underlying numpy array). If you write something like
List = n * List
with an integer n, your list gets repeated n times:
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x = 3*x
print(x)
>> [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
And negative values of n will empty your list, since they are treated as 0 (see here):
Values of n less than 0 are treated as 0 (which yields an empty
sequence of the same type as s).
So you have to use one of these methods to multiply each list element:
NewList = [i * 5 for i in List]

NewList = []
for i in List:
    NewList.append(i * 5)

import pandas as pd
s = pd.Series(List)
NewList = (s * 5).tolist()
You want the following:
x = [-1 * i for i in x]

How to calculate the Euclidean distance between two points stored in rows of two separate matrices?

I have two matrices, m1 and m2, each storing one point per row.
I would like to compute the distance between point X and point Y without using a loop, in such a way that the expression/function still works when the matrices are expanded by additional columns.
For validation one could use:
sqrt((m1[,1] - m2[,1])^2 + (m1[,2] - m2[,2])^2 + (m1[,3] - m2[,3])^2 + (m1[,4] - m2[,4])^2 + (m1[,5] - m2[,5])^2)
The expression above gives the correct result for the distance between X and Y; however, once the matrices are expanded by additional columns, the expression also has to be expanded, and that is an unacceptable solution...
Would you be so kind as to tell me how to achieve this? Any help would be more than welcome. I've been stuck on this one for a while...
Subtraction (-) between matrices is element-wise in R, and rowSums is useful for calculating the sum along each row:
m1 <- matrix(
c(4, 3, 1, 6,
2, 4, 5, 7,
9, 0, 1, 2,
6, 7, 8, 9,
1, 6, 4, 3),
nrow = 4
)
m2 <- matrix(
c(2, 6, 3, 2,
9, 4, 1, 4,
1, 3, 0, 1,
4, 5, 0, 2,
7, 2, 1, 3),
nrow = 4
)
sqrt((m1[,1] - m2[,1])^2 + (m1[,2] - m2[,2])^2 + (m1[,3] - m2[,3])^2 + (m1[,4] - m2[,4])^2 + (m1[,5] - m2[,5])^2)
# [1] 12.529964 6.164414 9.695360 8.660254
sqrt(rowSums((m1 - m2) ^ 2))
# [1] 12.529964 6.164414 9.695360 8.660254
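Wrapped as a function (the name row_dist is mine, just for illustration), the same idea keeps working no matter how many columns the matrices gain:

# Row-wise Euclidean distance between corresponding rows of two matrices.
row_dist <- function(a, b) sqrt(rowSums((a - b)^2))
row_dist(m1, m2)
# [1] 12.529964  6.164414  9.695360  8.660254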
