Converting functions to log space - math

I have to implement a couple of functions, specifically:
The issue is that I only have the log(p) values, which I keep in log form to prevent underflow. How do I rewrite those functions so that they work on, and return, log-space values?
I know for the products of p, I can convert that to sums of log(p) but I'm not sure about the others. Thanks.
My attempt in order:
log(1 - 10^log(p1))
1- prod_i,n(1 - log(pi)) Not sure??
sum_i,n(log(pi))
sum_i,n(log(pi)^wti)
sum_i,n(log(pi))/n
sum_i,n(wti*log(pi))/sum_i,n(wti)
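A minimal sketch of how these could be written, not a definitive answer. It assumes that the six target functions are the ones the attempts above suggest (1 - p1, 1 - prod(1 - pi), prod(pi), prod(pi^wti), the geometric mean, and the weighted geometric mean) and that the logs are natural logs. The helper name log1mexp and the sample values are hypothetical. If the stored values are base-10 logs, multiply them by log(10) first.

```r
# Hypothetical helper: log(1 - exp(lx)) for lx < 0, i.e. log(1 - p) given log(p).
# log1p() keeps precision when p is tiny; it is less accurate when p is close to 1.
log1mexp <- function(lx) log1p(-exp(lx))

p  <- c(0.2, 0.5, 0.9)   # illustrative probabilities (assumption)
lp <- log(p)             # what the asker actually stores: natural-log values
wt <- c(1, 2, 3)         # illustrative weights (assumption)

f1 <- log1mexp(lp[1])               # log(1 - p1)
f2 <- log1mexp(sum(log1mexp(lp)))   # log(1 - prod(1 - p_i))
f3 <- sum(lp)                       # log(prod(p_i))
f4 <- sum(wt * lp)                  # log(prod(p_i^wt_i)); the weight multiplies the log
f5 <- mean(lp)                      # log of the geometric mean
f6 <- sum(wt * lp) / sum(wt)        # log of the weighted geometric mean
```

Every result stays in log space; exp() is only needed if a plain probability is wanted at the very end.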

Related

Constraint using IF statement

I am using GAMS to solve a network distribution problem, and this is my first time using GAMS. I have the following constraint (see image), which I want to write in GAMS, but I keep getting errors. I am trying to express it with an IF statement, or in any other way that works. The variable z is a binary variable and has already been declared.
Thanks!
You do not need an if statement, but can handle this with dollar conditions.
You can do it with dollar conditions in the equation (as done here), or you could write three separate equations with dollar conditions to define the domain of each equation.
E_z(u,v,i).. sum(j, z(u,v,j,i)) - sum(j, z(u,v,i,j))
=E=
0 + 1$(sameas(i,u)) - 1$(sameas(i,v));
The sameas operator is documented here. If your sets have numerical values, it might be cleaner to do a value comparison, e.g. $(i.val = u.val).
You can read more about conditional expressions in GAMS in the following link:
https://www.gams.com/latest/docs/userguides/userguide/_u_g__cond_expr.html
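To make the three cases of the right-hand side concrete, here is a small R sketch of the same indicator arithmetic that 1$(sameas(i,u)) - 1$(sameas(i,v)) expresses. It is only an illustration outside GAMS; the node labels and the choice of u and v are hypothetical.

```r
# Hypothetical node set; u and v play the roles of the indices in the equation.
nodes <- c("n1", "n2", "n3", "n4")
u <- "n1"
v <- "n4"

# A comparison yields TRUE/FALSE, which arithmetic treats as 1/0,
# so this is +1 where i equals u, -1 where i equals v, and 0 elsewhere.
rhs <- (nodes == u) - (nodes == v)
names(rhs) <- nodes
print(rhs)
```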

ImpulseDE2, matrix counts contains non-integer elements

Possibly it's a stupid question (but be patient, I'm a beginner in R's world)... I'm working with ImpulseDE2, a package designed for RNA-seq data analysis across different time points (see article for more information).
The main function (runImpulseDE2) requires a count matrix and an annotation data frame. I've created both, but I get this error message:
Error in checkCounts(matCountData, "matCountData"): ERROR: matCountData contains non-integer elements. Requires count data.
I have tried some solutions and nothing seems to work (and I've not found any solution in the Internet)...
as.matrix(data)
(data + 1) > there are no NAs or zero values that could cause this error (which(is.na(data)) and which(data < 1) both return integer(0))
as.numeric(data) > this produces another error: ERROR: [Rownames of matCountData] was not given as input.
I think there's something I'm missing, but I'm completely stuck. Any tip is welcome!
And here is the (silly) solution! The function does not seem to accept non-integer (floating-point) values, so simply applying round() to the count matrix is enough to get rid of this error.
Thanks for your help!
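A minimal sketch of that fix on a made-up matrix (the object names and values are hypothetical, and ImpulseDE2 itself is not called here). It also shows why as.numeric() led to the row-name error: it drops the matrix dimensions and dimnames, whereas round() keeps them.

```r
# Hypothetical count matrix with fractional values, e.g. estimated counts.
matCounts <- matrix(c(10.0, 3.7, 0.2, 25.4), nrow = 2,
                    dimnames = list(c("geneA", "geneB"), c("s1", "s2")))

# TRUE here: at least one entry is not a whole number.
hasFractions <- any(matCounts != round(matCounts))

# round() returns a matrix of whole numbers and preserves row/column names.
matRounded <- round(matCounts)

# as.numeric() flattens to a plain vector, so dim and dimnames are lost.
flattened <- as.numeric(matCounts)
lostDims <- is.null(dim(flattened))
```

Whether rounding is statistically appropriate depends on where the fractional values came from; the sketch only shows the mechanics.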

Correctly setting up Shannon's Entropy Calculation in R

I was trying to run some entropy() calculations on force platform data, and I get a warning message:
> library(entropy)
> d2 <- read.csv("c:/users/SLA9DI/Documents/data2.csv")
> entropy(d2$CoPy, method="MM")
[1] 10.98084
> entropy(d2$CoPx, method="MM")
[1] 391.2395
Warning message:
In log(freqs) : NaNs produced
I am sure this is because entropy() is trying to take the log of a negative number. I also know R can handle complex numbers using complex(), but I have not been successful in getting it to work with my data. I did not get this warning on my CoPy data, only on the CoPx data (a force platform records Center of Pressure data in two dimensions). Does anyone have suggestions for getting complex() to work on my data set, or is there another function that would work better for a proper entropy calculation? Entropy shouldn't be that much greater in CoPx than in CoPy. I also tried data sets from other subjects and the same thing happened: the CoPx entropy measures gave me warning messages and the CoPy measures did not. I am attaching a link to a data set so anyone can try it out and see if they can figure it out, as the data is a little long to post here.
Data
Edit: Correct Answer
As suggested, I tried the table(...) function and received no warning or error message, and the entropy output was in the expected range. However, I had apparently overlooked discretize(), a function in the package, and that is what you are supposed to use to set up the data correctly for the entropy calculation.
I think there's no point in applying the entropy function on your data. According to ?entropy, it
estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y
(emphasis mine). This means that you need to convert your data (which seems to be continuous) to count data first, for instance by binning it.
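The binning step can be sketched in base R as below. This is only an illustration under assumptions: the data are simulated rather than the asker's CoPx series, the choice of 20 bins is arbitrary, and the entropy is the simple plug-in (maximum-likelihood) estimate in nats rather than the "MM" estimator used in the question. The entropy package's own discretize() serves the same purpose as the cut()/table() step.

```r
set.seed(1)
# Hypothetical continuous signal that includes negative values, like a CoP trace.
x <- rnorm(1000, mean = -2)

# Bin the continuous values into 20 equal-width intervals and count them.
counts <- table(cut(x, breaks = 20))

# Plug-in Shannon entropy from the bin frequencies (empty bins contribute 0).
p <- counts[counts > 0] / sum(counts)
H <- -sum(p * log(p))
```

Because the counts are non-negative by construction, log() is never applied to a negative number, and H is bounded above by log(20).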

Limiting Window Size and/or Removing Specific Rows of Time Values In R

I'm trying to figure out how to observe just one particular section of the data in the graph below (e.g. 5pm onwards). I know there are basically two methods of doing this:
1) Method 1: Limiting the window size, which requires the following function:
> symbols(Data$Times, Data$y, circles=Data$z, xlim=c("5:00pm","10:00pm"))
The problem is, I get an "invalid 'xlim' value" error when I try to input the two time endpoints.
2) Method 2: Clearing out the rows in Data$Times that have values over 5pm.
The problem here is that I'm not sure how to sort the rows by earliest time -> latest time OR how to define a new variable such that TimesPM <- Data$Times>"5pm" (what I typed just now obviously did not work.)
Any ideas? Thanks in advance.
ETA: This is what I plotted:
Times<-strptime(DATA$Time,format="%I:%M%p")
symbols(Times, y, circles=z, xaxt='n', inches=.4, fg="3", bg=(a), xlab="Times", ylab="y")
axis.POSIXct(1, at=Times, format="%I:%M%p")
Both approaches share a problem: your date-time values will almost certainly not compare correctly against a bare character string such as "5:00pm", even after the coercion performed by the ">" comparison operator. To get the best advice you need to present str(DATA$Times) or dput(head(DATA$Times)) or class(Data$Times). Plotting functions generally recognize valid date or date-time classes, or their numeric representation. If the ordering operation is not working, that raises the question of whether you have a proper class at all. Your axis labeling does suggest a date-time format of some sort, so we just need to figure out which class it really is.
Because you are building Times from the character vector in your Time column, you probably want to apply the restriction before you send the DATA$Time vector to strptime(). You still have not offered the requested clarifications, so I have no way to give tested or even very specific code, but you might be doing something like
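# as.POSIXlt() needs the format to parse strings such as "5:00pm"
hrs <- as.POSIXlt(DATA$Time, format="%I:%M%p")$hour
# keep only the entries from 5pm through the 10pm hour, then convert
Times <- strptime(DATA$Time[ hrs >= 17 & hrs <= 22 ],
format="%I:%M%p")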
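A self-contained sketch of that idea, using a made-up data frame because the real DATA was never posted. The column names, values, and object names are assumptions. Parsing "%p" depends on the locale; the sketch assumes an English or C locale where "am"/"pm" are recognized.

```r
# Hypothetical stand-in for the asker's DATA.
DATA <- data.frame(Time = c("9:15am", "4:45pm", "5:30pm", "9:05pm"),
                   y = c(1, 2, 3, 4),
                   z = c(0.5, 1.0, 0.7, 0.3),
                   stringsAsFactors = FALSE)

# Parse the strings; with no date given, strptime() fills in today's date.
parsed <- strptime(DATA$Time, format = "%I:%M%p")

# Minutes since midnight make the 5:00pm to 10:00pm window easy to express.
mins <- parsed$hour * 60 + parsed$min
evening <- DATA[mins >= 17 * 60 & mins <= 22 * 60, ]

# Numeric endpoints of the same window, parsed the same way as the data.
win <- as.numeric(as.POSIXct(strptime(c("5:00pm", "10:00pm"),
                                      format = "%I:%M%p")))
```

Here evening holds only the rows inside the window (Method 2), while win is a numeric pair that should be usable as an xlim value instead of the character strings that triggered the "invalid 'xlim' value" error (Method 1).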

Preventing R From Rounding

How do I prevent R from rounding?
For example,
> a<-893893084082902
> a
[1] 8.93893e+14
I am losing a lot of information there. I have tried signif() and it doesn't seem to do what I want.
Thanks in advance!
(This came up as a result of a student of mine trying to determine how long it would take to count to a quadrillion at a number per second)
It's not rounding; it's just the default format for printing large (or small) numbers.
> a <- 893893084082902
> sprintf("%f",a)
[1] "893893084082902.000000"
See the "digits" section of ?options for a global solution.
This would show you more digits for all numbers:
options(digits=15)
Or, if you want it just for a:
print(a, digits=15)
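As a quick check that only the display is affected, the sketch below formats the same value without scientific notation. It uses base R only; the variable names are illustrative.

```r
a <- 893893084082902

# Fixed notation: every integer digit is shown.
fixed <- format(a, scientific = FALSE)

# sprintf() with no decimal places gives the same digits as a string.
plain <- sprintf("%.0f", a)

# The stored value was never altered, so this comparison is TRUE.
unchanged <- a == 893893084082902
```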
To get around R's integer limits, you could use the gmp package for R: http://cran.r-project.org/web/packages/gmp/index.html
I discovered this package when playing with the Project Euler challenges and needing to do factorizations. But it also provides functions for big integers.
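For context on where those limits sit, here is a short base-R sketch (no gmp needed). R's integer type is 32-bit, while ordinary numeric values are doubles that represent whole numbers exactly only up to 2^53. The variable names are illustrative.

```r
# Largest value of R's 32-bit integer type.
int_max <- .Machine$integer.max

# Doubles represent every whole number exactly up to 2^53.
big <- 2^53

# Past that point neighbouring whole numbers collapse: this is TRUE.
collapses <- (big + 1) == big

# The number from the question, and a quadrillion, are still below the limit.
still_exact <- 893893084082902 < big && 1e15 < big
```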
EDIT:
It looks like this question was not really one about big integers as much as it was about rounding. But for the next space traveler who comes this way, here's an example of big integer math with gmp:
Try and multiply 1e500 * 1e500 using base R:
> 1e500 * 1e500
[1] Inf
To do the same with gmp, you first need to create a big integer object, which gmp calls bigz. Passing as.bigz() an int or double holding a really big number will not work, because the whole reason for using gmp is that R can't hold a number this big. So we pass it a string instead. The following code starts with some string manipulation to build the big string:
library(gmp)
o <- paste(rep("0", 500), collapse="")
a <- as.bigz(paste("1", o, sep=""))
mul.bigz(a, a)
You can count the zeros if you're so inclined.
