A variant of the 2D knapsack or square packing - r

I am dealing with an optimisation issue, which I classified as a combinatorial problem. Now, I know this is a 2D variant of the knapsack problem, but please bear with me:
I have an area modeled as a grid of equal-sized cells, and I need to place a number of square objects of different sizes on this grid. Every object has a cost and a benefit, and the goal is to find an arrangement of the objects that maximises the benefit/cost ratio:
Object 1: 1x1 square, cost = 800, value = 2478336
Object 2: 2x2 square, cost = 2000, value = 7565257
Object 3: 3x3 square, cost = 3150, value = 14363679
Object 3 has the best value/cost ratio, so I guess the approach would be a greedy one: first place as many of the bigger squares as possible. Still, there are many optimal solutions depending on the size of the area.
Also, the square objects cannot overlap.
I am using R for this, and the package adagio has algorithms for the single and multiple knapsack problems, but not for a 2D knapsack problem. Because I am very new to optimization and programming, I am not sure whether there is a way of solving this problem with R. Can someone please help?
Thanks!

Firstly, I'm not an expert in R or adagio. Secondly, I think your problem is not exactly a 2D knapsack: it looks like a variant of a packing problem, so it requires a different approach.
So, first, check this awesome list of R optimization packages, especially the following sections:
Specific Applications in Optimization (for example, tabu search could be useful for you)
Mathematical Programming Solvers/Interfaces to Open Source Optimizers (lpSolve definitely could solve your task; see the sketch at the end of this answer)
Global and Stochastic Optimization (some of these packages could be used to solve your task)
If you're not tied to R, consider MiniZinc as a solver. It's very easy to install and use, it's pretty efficient in terms of memory and time consumption, and there are plenty of great examples of how to use it.
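To make the lpSolve suggestion concrete, here is a minimal sketch (not the only possible formulation) of an integer program that places the three square types on a small grid without overlaps and maximises total value. The 5x5 grid, the unlimited supply of each object type, and the plain value-maximising objective are my assumptions; the benefit/cost ratio objective from the question would still need an extra step, for example re-solving under several cost budgets.

library(lpSolve)

## Sketch only: one binary variable per (object type, top-left cell) placement;
## each grid cell may be covered by at most one placement.
n     <- 5                                   # assumed grid: n x n cells
sizes <- c(1, 2, 3)                          # object side lengths
value <- c(2478336, 7565257, 14363679)
cost  <- c(800, 2000, 3150)

## enumerate all feasible placements (object type + top-left cell)
placements <- do.call(rbind, lapply(seq_along(sizes), function(k) {
  s <- sizes[k]
  expand.grid(type = k, row = 1:(n - s + 1), col = 1:(n - s + 1))
}))

## constraint matrix: one row per grid cell, one column per placement,
## entry 1 if that placement covers that cell
covers <- sapply(seq_len(nrow(placements)), function(p) {
  s <- sizes[placements$type[p]]
  cell <- matrix(0, n, n)
  cell[placements$row[p] + 0:(s - 1), placements$col[p] + 0:(s - 1)] <- 1
  as.vector(cell)
})

sol <- lp(direction    = "max",
          objective.in = value[placements$type],
          const.mat    = covers,
          const.dir    = rep("<=", n * n),
          const.rhs    = rep(1, n * n),
          all.bin      = TRUE)

chosen <- placements[sol$solution > 0.5, ]
chosen
sum(value[chosen$type]) / sum(cost[chosen$type])   # benefit/cost of this layout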

Related

raster::gridDistance() fails on medium to large raster files

I have some moderately sized raster files (max size ~190 MB) for which I would like to calculate grid distances using raster::gridDistance().
I'm finding that the operation is slow and/or R crashes for the largest of my files. Please note: I'm not seeking memory management advice (e.g. maxing out memory.limit(), breaking into smaller rasters or pursuing parallel processing methods), as these sidestep my issue. If grid distances should not be attempted for 190+ MB files, then I will just break the job into more manageable pieces.
The raster::gridDistance() documentation mentions that I can try to solve "errors in the case of complex objects spread over different chunks... by varying the chunk size, see function setOptions()" and that "Additional distance measures and options (directions, cost-distance) are available in the {gdistance} package", but I have been hesitant to pursue these without better understanding the limitations/considerations.
Thanks to the question "R - terra::distance() equivalent of raster::gridDistance(..., origin = x, omit = y)", I understand that there is an alternative method using terra::gridDistance(), but I am not able to discern whether that operation is any more efficient or suitable for my needs than raster::gridDistance().
I haven't posted a reprex or session info as my question is really as follows:
Is terra::gridDistance() (or some other alternative like those offered by {gdistance}) really a more efficient (faster) or more customizable way of calculating grid distances for moderate to large raster files?
If not, what are considerations for changing how the grid distance is calculated (varying chunk size or other means) using raster::gridDistance() and setOptions()?
If there is interest, I can reformat my question so that it better fits guidelines with a reprex etc. Also, I am posting the question here rather than Geographic Information Systems because the original linked question was posted here.
I understand that there is an alternative method using terra::gridDistance(), but I am not able to discern if the operation is any more efficient or suitable for my needs
Well, did you try it? That could have been more efficient than writing a long question.
The help file does not mention the limitations that raster::gridDistance has, so you should be good to go. But note that the method was renamed to terra::gridDist().
The "terra" package is the replacement for the "raster" package, so "terra" is the best starting point more generally, I think.

R equivalent to matlab griddata, scatteredInterpolant, and/or TriScatteredInterp

We do a lot of full field 3D numerical simulations (CFD, FEA, etc.). The solutions take a long time to run. We often interpolate from solutions rather than rerun every case. We also interpolate between multiple solutions, which leads to even higher dimensional interpolation (like adding time, so x,y,z,t,v).
Matlab does a great job of reading data V at an irregular grid of X,Y,Z coordinates and interpolating from V using griddata, scatteredInterpolant, and/or TriScatteredInterp. For a variety of reasons, I've switched to R. This remains one key area where I've not been able to find as good an R equivalent. 'akima' only does x,y,V (not x,y,z,V, much less even higher dimensions like x,y,z,t,v).
The next best thing I've found has been kriging. But kriging behaves more like model fitting and projection, and often does not behave well between irregular grid points. So it's not nearly as robust as simple direct linear interpolation.
Matlab has had griddata for several decades. It's hard to believe R doesn't have an equivalent out there. Any suggestions? Or is there at least a way to use kriging to yield effectively the same result as a direct linear interpolation?
Jonathan
You might start by looking at the package "tripack" to do Delaunay triangulation, which gives you the first step in duplicating scatteredInterpolant().
R interpp() is equivalent to MATLAB scatteredInterpolant().
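A minimal sketch of what interpp() looks like in practice; the random sample data are made up, interpp() here comes from the akima package (the newer interp package exports a compatible function), and note that it only covers the bivariate x,y,V case from the question, not x,y,z,V or higher:

library(akima)

set.seed(1)
x <- runif(200); y <- runif(200)                # irregular sample locations
v <- sin(2 * pi * x) * cos(2 * pi * y)          # values V at those locations

xo <- runif(10); yo <- runif(10)                # points to interpolate at

est <- interpp(x, y, v, xo, yo, linear = TRUE)  # direct linear interpolation
cbind(est$x, est$y, est$z)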

Handling extremely small numbers

I need to find a way to handle extremely small numbers in R, particularly in order to take the log of extremely small numbers. According to the R-manual, “on a typical R platform the smallest positive double is about 5e-324.” Well, I need to deal with numbers even smaller (at least as small as 10^-350). If R is incapable of doing this, I was wondering if there is a way I can use a program that can do this (such as Matlab or Mathematica) from R.
Specifically, I am computing a matrix of probabilities, and some of these probabilities are so small that R does not distinguish them from 0. The reason I know this is because each probability is the product of two other probabilities; so I’ll have p(x)=10^-300, p(y)=10^-50, and then p(x)*p(y)=0. I’d like to be able to do these computations, take the log of the resultant very small number (-805.905 for my example, according to Mathematica), and then continue working with the log values in R.
So to be more detailed, I have a matrix of values for p(x), a matrix of values for p(y), both computed using dnorm, and I’m computing the product. In many cases, R is capable of evaluating p(x) and p(y), but the p(x)*p(y) is too small. In a few cases, though, even the p(x) or p(y) value itself is too small, and is itself just equated to 0 in R.
I’ve seen that there is stuff out there for calling R from Mathematica, but not much pertaining to calling Mathematica from R. I’d honestly prefer to do the latter than the former here. So if any one either knows how to do this (either employing Mathematica or Matlab or something else in R) or has another solution to this issue, I’d greatly appreciate it.
Note that I realize there are a few other threads on this topic, discussing such things as using the Brobdingnag package to deal with small numbers, but these do not appear applicable here.
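For what it's worth, since both matrices come from dnorm(), the arithmetic described in the question can be done entirely in log space: dnorm() has a log argument, and the log of the product is just the sum of the logs, so the underflowing product never has to be formed. A minimal sketch with made-up observations and parameters:

## made-up values, purely to show the mechanics
x <- 40; y <- 12
log_px <- dnorm(x, mean = 0, sd = 1, log = TRUE)   # about -800.9, well below log(5e-324)
log_py <- dnorm(y, mean = 0, sd = 1, log = TRUE)
log_pxy <- log_px + log_py                         # log(p(x) * p(y)), an ordinary double

## the question's own example: log(1e-300 * 1e-50)
-350 * log(10)                                     # -805.9048, matching the Mathematica value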

Graph partitioning optimization

The problem
I have a set of locations on a plane (actually they are pins in a KML file) and I want to partition this graph into subgraphs. Connectivity is pretty good - as with all real world road networks - so I assume that if two locations are close they have some kind of connection. The resulting set of subgraphs should adhere to these constraints:
Every node has to be covered by a subgraph
Every node should be in exactly 1 subgraph
All nodes within a subgraph should be close to each other (L2 norm distances)
Every subgraph should contain at least 5 locations
The amount of subgraphs should be minimal
Right now the number of locations is no more than 100, so I thought about brute-forcing through every possibility, but this obviously won't scale well.
I thought about using some k-Nearest-Neighbors algorithm (e.g. using QuickGraph) but I can't get my head around where to start and how to extend/shrink the subgraphs on the way. Maybe it's possible to map this problem to another problem that can easily be solved with some numerical procedure (e.g. Simplex) ...
Maybe someone has experience with this kind of optimization problem and is willing to help me find a solution? I don't have access to Mathematica/Matlab or the like ... but sufficient .NET programming skills and hmm Excel :-)
Thanks a lot!
As soon as there are multiple criteria that need to be satisfied as well as possible simultaneously, things usually start to get difficult.
A numerical solution could work as follows: you define a utility function that maps partitionings of your locations to positive real values, describing how "good" a partition is by assigning it a "rating" (good could be high, bad could be near zero).
Once you have such a function assigning each partition its "value", you simply need to optimize it, and you will hopefully obtain a good solution if you defined your utility function reasonably. Evolutionary algorithms are good at that task, since your utility function is probably too complex to optimize analytically due to its discrete nature.
The only remaining problem is how you assign "values" to partitions via this utility function; that part is your task. It can be done, for example, by weighting each criterion with a factor and summing the results up, or with more complex functions (least squares etc.). The factors you use in the definition of the utility function are tuning parameters and can be varied until the result seems good.
Some CA software would help a lot for testing if you can get your hands on one, but I guess that to obtain a black-box solver for your partitioning problem, you need to implement the complete procedure yourself using a language of your choice.
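To make the weighted-sum idea a bit more concrete, here is a toy sketch (in R, simply because that is the language used elsewhere on this page; the weights, the penalty for undersized subgraphs, and the random locations are all illustrative assumptions, not a recommended scoring):

set.seed(42)
loc <- matrix(runif(100 * 2), ncol = 2)                  # 100 locations on the plane

## score a partition given as a vector of subgraph labels (one per location)
utility <- function(labels, loc, w1 = 1, w2 = 5, w3 = 0.1) {
  groups <- split(seq_len(nrow(loc)), labels)
  ## criterion: nodes within a subgraph should be close (average pairwise distance)
  spread <- mean(sapply(groups, function(i) {
    if (length(i) < 2) return(0)                         # a single node has no spread
    mean(dist(loc[i, , drop = FALSE]))
  }))
  ## criterion: subgraphs with fewer than 5 locations are penalised
  too_small <- sum(sapply(groups, length) < 5)
  ## criterion: fewer subgraphs is better
  k <- length(groups)
  1 / (1 + w1 * spread + w2 * too_small + w3 * k)        # high = good, near zero = bad
}

## rate one random partition of the 100 locations into 10 subgraphs
utility(sample(1:10, 100, replace = TRUE), loc)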

R function to solve large dense linear systems of equations?

Sorry, maybe I am blind, but I couldn't find anything specific for a rather common problem:
I want to implement
solve(A, b)
with A being a large square matrix, in the sense that the command above uses all my memory and issues an error (b is a vector of corresponding length). The matrix I have is not sparse in the sense that there would be large blocks of zeros, etc.
There must be some function out there which implements a stepwise iterative scheme such that a solution can be found even with limited memory available.
I found several posts on sparse matrices and, of course, the Matrix package, but could not identify a function which does what I need. I have also seen this post, but biglm produces a complete linear model fit. All I need is a simple solve. I will have to repeat that step several times, so it would be great to keep it as slim as possible.
I already worry about the "duplication of an old issue" and "look here" comments, but I would be really grateful for some help.
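As an illustration of the kind of stepwise iterative scheme mentioned above, here is a minimal conjugate-gradient loop in base R. It assumes A is symmetric positive definite (the question does not say whether that holds) and only touches A through matrix-vector products, which is what makes such schemes attractive when a full factorisation does not fit in memory:

## minimal conjugate gradient; assumes A is symmetric positive definite
cg_solve <- function(A, b, tol = 1e-8, maxit = 1000) {
  x <- numeric(length(b))
  r <- b - drop(A %*% x)            # residual
  p <- r
  rs_old <- sum(r * r)
  for (i in seq_len(maxit)) {
    Ap     <- drop(A %*% p)
    alpha  <- rs_old / sum(p * Ap)
    x      <- x + alpha * p
    r      <- r - alpha * Ap
    rs_new <- sum(r * r)
    if (sqrt(rs_new) < tol) break
    p      <- r + (rs_new / rs_old) * p
    rs_old <- rs_new
  }
  x
}

## tiny correctness check against solve() on a small SPD matrix
set.seed(1)
M <- matrix(rnorm(100), 10, 10)
A <- crossprod(M) + diag(10)        # symmetric positive definite test matrix
b <- rnorm(10)
max(abs(cg_solve(A, b) - solve(A, b)))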

Resources