L1 regularisation in cplex - convex-optimization

I am trying to perform an optimization that uses L1 regularisation.
However, I am using CPLEX and I do not see an obvious way of doing L1 regularisation with it. Can someone please help?

Let me start with the curve-fitting example from Model Building. Without regularization:
int n=...;
range points=1..n;
float x[points]=...;
float y[points]=...;

// y == b*x + a
dvar float a;
dvar float b;

minimize sum(i in points) (b*x[i]+a-y[i])^2;
subject to
{
}

execute
{
  writeln("b=",b);
  writeln("a=",a);
}
The Lasso Version (L1 regularisation) would be:
int n=...;
range points=1..n;
float x[points]=...;
float y[points]=...;
float lambda=0.1;

// y == b*x + a
dvar float a;
dvar float b;

minimize sum(i in points) (b*x[i]+a-y[i])^2 + lambda*(abs(a)+abs(b));
subject to
{
}

execute
{
  writeln("b=",b);
  writeln("a=",a);
}

Related

Segmentation fault for big input (recursive function)

#include <bits/stdc++.h>
using namespace std;
#define ll long long

ll solve(ll a, ll b, ll i){
    // base case
    if (a == 0) return i;
    if (b > a) return i+1;
    // recursive case
    if (b == 1) {
        return solve(a, b+1, i+1);
    }
    ll n = solve(a, b+1, i+1);
    ll m = solve(a/b, b, i+1);
    return min(n, m);
}

int main(){
    int t;
    cin >> t;
    while(t--){
        ll a, b;
        cin >> a >> b;
        cout << solve(a, b, 0) << endl;
    }
}
The question is from Codeforces (1485A). The problem is that for a big input such as a = 50000000 and b = 5 the code gives a segmentation fault, while it works fine for smaller inputs. Please help me solve it.
Using recursion here is a terrible choice: the solve(a, b+1, i+1) branch recurses until b exceeds a, so the call stack grows roughly linearly in a and overflows for large inputs. You also need to make all the obvious algorithmic optimizations.
The key insight is that for any path that divides before increasing b, there is a path at least as good that does not divide before increasing b. Why divide by a smaller number when you can divide by a bigger one, if you are going to spend the steps increasing b anyway?
With that insight, and removing recursion, the problem is trivial to solve:
#include <iostream>

unsigned long long divisions(unsigned long long a, unsigned long long b)
{
    // figure out how many divide operations we need to reach zero
    unsigned long long ops = 0;
    while (a > 0)
    {
        a /= b;
        ops++;
    }
    return ops;
}

unsigned long long ops(unsigned long long a, unsigned long long b)
{
    // figure out how many divides we need with the smallest possible b
    unsigned long long min_ops = (b == 1) ? (1 + divisions(a, b + 1)) : divisions(a, b);
    // try every sensible larger b to see if it takes fewer operations
    for (unsigned long long num_inc = 1; num_inc <= min_ops; ++num_inc)
    {
        unsigned long long cur = num_inc + divisions(a, b + num_inc);
        if (cur < min_ops)
            min_ops = cur;
    }
    return min_ops;
}

int main(void)
{
    int t;
    std::cin >> t;
    while (t--)
    {
        unsigned long long a, b;
        std::cin >> a >> b;
        std::cout << ops(a, b) << std::endl;
    }
}
Again, the lesson is that you must make algorithmic optimizations before you start coding: no amount of great coding will make a terrible algorithm work well. For example, with a = 9 and b = 2 the program prints 4: four divisions by 2 (9, 4, 2, 1, 0), and no amount of incrementing b first beats that.
By the way, there was a huge hint on the problem page: one of the problem tags gives the key optimization away.

Method in a constraint

I have a CPLEX constraint of the form: a binary variable multiplied by a number >= another number.
The second number is complex to calculate, so I think I need a method to compute it. Is it possible in CPLEX to write a constraint like this:
k*y[i] > method(parameter1,parameter2)
Inside the method I need access to the values of the binary variables.
Thanks a lot for any replies.
Let me try this Oulipo challenge: write an OPL model that works and contains what you wrote.
Could this help?
float k=1.2;
dvar boolean y[1..1];
int parameter1=1;
int parameter2=2;
dvar boolean x;

dexpr float method[i in 1..10, j in 1..10] = x*(i+j);

subject to
{
  forall(i in 1..1)
    k*y[i] >= method[parameter1,parameter2];
}
PS: with your later comments:
float k=1.2;
dvar boolean y[1..1];
int parameter1=1;
int parameter2=2;
dvar boolean x;
float methodresults[i in 1..10, j in 1..10]; //=x*(i+j);
range r=1..10;

execute
{
  function method(i,j)
  {
    return i+j;
  }
  for(var i in r) for (var j in r) methodresults[i][j]=method(i,j);
}

subject to
{
  forall(i in 1..1)
    k*y[i] >= x*methodresults[parameter1,parameter2];
}
If you are using a script in a .mod file, then you can define a function within an execute block [1]. These blocks define pre-processing or post-processing instructions written in ILOG Script [2]. Here is a trivial example from the documentation at https://www.ibm.com/support/knowledgecenter/SSSA5P_12.9.0/ilog.odms.ide.help/OPL_Studio/opllangref/topics/opl_langref_script_struct_statements_function.html:
execute {
  function add(a, b) {
    return a+b;
  }
  writeln(add(1,2));
}
[1] https://www.ibm.com/support/knowledgecenter/SSSA5P_12.9.0/ilog.odms.ide.help/OPL_Studio/opllanguser/topics/opl_languser_script_intro_presynt.html
[2] https://www.ibm.com/support/knowledgecenter/SSSA5P_12.9.0/ilog.odms.ide.help/OPL_Studio/opllanguser/topics/opl_languser_script.html

Recursive vs Iterative Traversal of a BST

If I do a recursive traversal of a binary tree of N nodes, it will occupy N slots on the execution stack.
If I use iteration, I will have to use N slots in an explicit stack.
The question is: do we say that recursive traversal also has O(N) space complexity, like the iterative one?
I am talking in terms of running traversal code on some platform that bounds me by memory limits.
Also, I am not talking about directly implementing a traversal (in which case one could say either approach is fine); I am implementing an algorithm for KthSmallestElement() in a BST, which uses a sort of traversal through the BST.
Should I use the iterative approach or the recursive approach in terms of space complexity, so that my code doesn't fail the memory limits?
To put it clearly, here is what I implemented:
int Solution::kthsmallest(TreeNode* root, int k) {
    stack<TreeNode *> S;
    while (1)
    {
        // push the whole left spine onto the stack
        while (root)
        {
            S.push(root);
            root = root->left;
        }
        // pop the next node in in-order sequence
        root = S.top();
        S.pop();
        k--;
        if (k == 0)
            return root->val;
        // continue with the right subtree
        root = root->right;
    }
}
Here is what my friend implemented:
class Solution {
public:
    int find(TreeNode* root, int &k) {
        if (!root) return -1;
        // We do an inorder traversal here.
        int k1 = find(root->left, k);
        if (k == 0) return k1;        // left subtree has k or more elements
        k--;
        if (k == 0) return root->val; // root is the kth element
        return find(root->right, k);  // answer lies in the right subtree
    }
    int kthsmallest(TreeNode* root, int k) {
        return find(root, k); // call a helper so that k can be passed by reference
    }
};
So which of the two is better, and why?
If you care about memory use, you should try to ensure that your tree is balanced, i.e. that its depth is much smaller than the number of nodes. A perfectly balanced binary tree with N nodes has depth log2(N), rounded up.
This matters because the memory needed to visit all nodes in a binary tree is proportional to the depth of the tree, not to the number of nodes as you erroneously think; the recursive or iterative program only needs to "remember" the path from the root to the current node, not all previously visited nodes.
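To see this concretely, here is a minimal sketch (maxStackDepth is a hypothetical helper, not taken from either snippet above) that records the peak size of the explicit stack during an iterative in-order traversal; for a reasonably balanced tree of N nodes the peak stays near log2(N), not N:

#include <algorithm>
#include <cstddef>
#include <stack>

struct TreeNode { int val; TreeNode* left; TreeNode* right; };

// Hypothetical helper: peak number of simultaneous entries in the
// explicit stack during an in-order traversal. The stack never holds
// more than one root-to-node path, so the peak is bounded by the
// tree's height, not by the node count.
std::size_t maxStackDepth(TreeNode* root) {
    std::stack<TreeNode*> s;
    std::size_t peak = 0;
    while (root || !s.empty()) {
        while (root) {               // descend along the left spine
            s.push(root);
            peak = std::max(peak, s.size());
            root = root->left;
        }
        root = s.top(); s.pop();     // visit this node
        root = root->right;          // then traverse its right subtree
    }
    return peak;
}

The same bound applies to your friend's recursive version, except that there the path lives on the call stack, so a deep unbalanced tree can exhaust the platform's stack limit before an explicit heap-allocated stack would.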

Rcpp keeps running for a seemingly simple task

I've been thinking about it all day and still cannot figure out why this happens. My objective is simple: STEP 1, write a function S(h,p); STEP 2, numerically integrate S(h,p) with respect to p by the trapezoidal rule to obtain a new function SS(h). I wrote the code and compiled it with sourceCpp, and it successfully generated the two functions S(h,p) and SS(h) in R. But when I tried to test it by calculating SS(1), R just kept running and never gave a result, which is weird because the amount of computation is not that big. Any idea why this would happen?
My code is here:
#include <Rcpp.h>
using namespace Rcpp;

// generate the first function that gives S(h,p)
// [[Rcpp::export]]
double S(double h, double p){
    double out = 2*(h + p + h*p);
    return out;
}

// generate the second function that gives the numerical integration of S(h,p) w.r.t. p
// [[Rcpp::export]]
double SS(double h){
    double out1 = 0;
    double sum = 0;
    for (int i = 0; i < 1; i = i + 0.01){
        sum = sum + S(h, i);
    }
    out1 = 0.01/2*(2*sum - S(h,0) - S(h,1));
    return out1;
}
The problem is that you are treating i as if it were not an int in this statement:
for (int i = 0; i < 1; i = i + 0.01){
    sum = sum + S(h,i);
}
After each iteration you attempt to add 0.01 to an integer; the sum is computed in floating point but immediately truncated towards zero on assignment, so i always remains zero and you have an infinite loop. A minimal example highlighting the problem, with a couple of possible solutions:
#include <Rcpp.h>
#include <cstdio>

// [[Rcpp::export]]
void bad_loop() {
    for (int i = 0; i < 1; i += 0.01) {
        std::printf("i = %d\n", i);
        Rcpp::checkUserInterrupt();
    }
}

// [[Rcpp::export]]
void good_loop() {
    for (int i = 0; i < 100; i++) {
        std::printf("i = %d\n", i);
        Rcpp::checkUserInterrupt();
    }
}

// [[Rcpp::export]]
void good_loop2() {
    for (double j = 0.0; j < 1.0; j += 0.01) {
        std::printf("j = %.2f\n", j);
        Rcpp::checkUserInterrupt();
    }
}
The first alternative (good_loop) is to scale your step size appropriately -- looping from 0 through 99 by 1 takes the same number of iterations as looping from 0.0 to 0.99 by 0.01. Alternatively, you could just use a double instead of an int, as in good_loop2. At any rate, the main takeaway here is that you need to be more careful about choosing your variable types in C++. Unlike R, when you declare i to be an int, it will be treated as an int, not as a floating point number.
As @nrussell pointed out very expertly, the issue is treating i as if it held a double when it is declared as an int. The goal of posting this answer is to stress the need to avoid using a double or a float as a loop counter at all; I've opted to post it as an answer instead of a comment for readability.
Please note that a loop counter should never be a double or a float, due to precision issues: for example, it is hard to land exactly on i = 0.99, because repeatedly adding 0.01 accumulates rounding error and drifts slightly from the exact decimal values.
Instead, I would opt to have the loop counter be an int and convert it to a double / float as soon as possible, e.g.
for (int i = 0; i < 100; i++){
    // make sure to use floating-point division
    // (either the numerator or the denominator must be a double)
    sum += S(h, i/100.0);
}
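Putting the pieces together, a corrected integrator along these lines might look like the sketch below. SS2 is a hypothetical name, the integrand S is copied from the question, and the final trapezoid-style combination is kept exactly as the original post wrote it:

#include <Rcpp.h>

double S(double h, double p) {
    // same integrand as in the question
    return 2 * (h + p + h * p);
}

// [[Rcpp::export]]
double SS2(double h) {
    double sum = 0;
    for (int i = 0; i < 100; i++) {
        // integer counter, converted to the grid point p = i/100.0
        sum += S(h, i / 100.0);
    }
    // combination kept as in the original post
    return 0.01 / 2 * (2 * sum - S(h, 0) - S(h, 1));
}

Whether that last line weights the endpoints exactly as the trapezoidal rule requires is a separate question from the loop-counter fix; the point here is only that the loop now terminates after 100 iterations.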
Further notes:
RcppArmadillo and C++ division issue
Using float / double as a loop variable

boosting parallel reduction OpenCL

I have an algorithm performing a two-stage parallel reduction on the GPU to find the smallest element in an array. I know that there is a hint on how to make it work faster, but I don't know what it is. Any ideas on how I can tune this kernel to speed my program up? It is not necessary to actually change the algorithm; maybe there are other tricks. All ideas are welcome.
Thank you!
__kernel
void reduce(__global float* buffer,
            __local float* scratch,
            __const int length,
            __global float* result) {
    // stage 1: each work-item scans a strided slice of the buffer
    int global_index = get_global_id(0);
    float accumulator = INFINITY;
    while (global_index < length) {
        float element = buffer[global_index];
        accumulator = (accumulator < element) ? accumulator : element;
        global_index += get_global_size(0);
    }
    // stage 2: tree reduction in local memory, one result per work-group
    int local_index = get_local_id(0);
    scratch[local_index] = accumulator;
    barrier(CLK_LOCAL_MEM_FENCE);
    for (int offset = get_local_size(0) / 2; offset > 0; offset /= 2) {
        if (local_index < offset) {
            float other = scratch[local_index + offset];
            float mine = scratch[local_index];
            scratch[local_index] = (mine < other) ? mine : other;
        }
        barrier(CLK_LOCAL_MEM_FENCE);
    }
    if (local_index == 0) {
        result[get_group_id(0)] = scratch[0];
    }
}
accumulator = (accumulator < element) ? accumulator : element;
Use the fmin function - it is exactly what you need, and it may result in faster code (a call to a built-in instruction, if available, instead of costly branching).
global_index += get_global_size(0);
What is your typical get_global_size(0)?
Though your access pattern is not very bad (it is coalesced: 128-byte chunks for a 32-thread warp), it is better to access memory sequentially whenever possible. For instance, sequential access may aid memory prefetching (note that OpenCL code can be executed on any device, including a CPU).
Consider the following scheme: each thread would process the range
[ get_global_id(0)*delta , (get_global_id(0)+1)*delta )
This would result in fully sequential access per thread.
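Putting both suggestions together, the accumulation stage might look like the sketch below. Note this is only a sketch: delta is an assumed extra kernel argument (for example, length divided by get_global_size(0), rounded up), not part of the original kernel:

// hypothetical reworked stage 1: contiguous per-thread range plus fmin
int start = get_global_id(0) * delta;
int end = min(start + delta, length);   // clamp the last thread's range
float accumulator = INFINITY;
for (int i = start; i < end; i++)
    accumulator = fmin(accumulator, buffer[i]);

The local tree reduction in stage 2 stays exactly as before.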
