Hashtable which uses linear probing - hashtable

Suppose I have the following hash table which uses linear probing:
[Amy,Barry,None,None,Carlie,Darren]
which start at index 0. What is the possible value could hash(Amy) be?
My answer to this is 0, but I don't feel like Im correct. Some help would be appreciated

Related

ILP formulation for max-weighted cycle in a directed graph

I am currently trying to find the longest cycle in a !directed! graph G = (V,E). For that, I first want to (have to) formulate an integer linear program (ILP). The objective function is in my opinion quite clear: min sum_{e in E} -w_e * x_e (I formulate it as a minimization problem for future work). Here, x_e is a binary variable which is 1 if I use this specific edge (/arc) and zero otherwise; w_e is the weight of edge e.
To assure that the solution is a simple cycle, I thought about having deg_{in}^{i} = deg_{out}^{i} = 1 for all nodes i in the solution set, i.e. there is exactly one arc entering node i and exactly one arc leaving node i, given that node i is part of the cycle.
Those constraints, however, are not enough to solve this problem, as it could still happen that the solution is a set of disjoint cycles, instead of one single cycle.
Does somebody have an idea how to solve the problem?
Thank you!!
You need to add TSP-like subtour-elimination constraints. See: https://yetanothermathprogrammingconsultant.blogspot.com/2020/02/longest-path-problem.html.

Julia and system of ordinary differential equations

I want to try to solve a system of ordinary differential equations, perhaps parallelized and came across Julia and DifferentialEquations.jl. the system looks like
x'(t) = f(t)*z(t)
y'(t) = g(t)*z(t)
z'(t) = f(t)*(1-2*x(t))/(2) -g(t)*y(t)
over 10^2 < t < 10^14, but my initial boundary conditions are
x(10^14) == 0
y(10^14) == 0
z(10^14) == 0
Could someone please explain to me how to setup this problem in julia? I checked the documentation and could only find u0 as a parameter, but it doesn't give details on choosing for a right handed set of boundary conditions Many thanks!
You're looking to solve a boundary value problem (BVP). While this area is currently less developed than other areas of DifferentialEquations.jl, there are methods which exist for this which are shown in the tutorial on solving BVPs. The MIRK4 method may be the one to try.
I will note however that your timescale is quite large and may lead to numerical errors. Either using higher precision numbers (BigFloat, ArbFloat, DoubleFloat) may be required for handling that range, or you may want to rescale time in your equations so that way it better fits for standard double precision floating point numbers (Float64).

Creating an efficient function to fit a dataset

Basically I have a large (could get as large as 100,000-150,000 values) data set of 4-byte inputs and their corresponding 4-byte outputs. The inputs aren't guaranteed to be unique (which isn't really a problem because I figure I can generate pseudo-random numbers to add or xor the inputs with so that they do become unique), but the outputs aren't guaranteed to be unique either (so two different sets of inputs might have the same output).
I'm trying to create a function that effectively models the values in my data-set. I don't need it to interpolate efficiently, or even at all (by this I mean that I'm never going to feed it an input that isn't contained in this static data-set). However it does need to be as efficient as possible. I've looked into interpolation and found that it doesn't really fit what I'm looking for. For example, the large number of values means that spline interpolation won't do since it creates a polynomial per interval.
Also, from my understanding polynomial interpolation would be way too computationally expensive (n values means that the polynomial could include terms as high as pow(x,n-1). For x= a 4-byte number and n=100,000 it's just not feasible). I've tried looking online for a while now, but I'm not very strong with math and must not know the right terms to search with because I haven't come across anything similar so far.
I can see that this is not completely (to put it mildly) a programming question and I apologize in advance. I'm not looking for the exact solution or even a complete answer. I just need pointers on the topics that I would need to read up on so I can solve this problem on my own. Thanks!
TL;DR - I need a variant of interpolation that only needs to fit the initially given data-points, but which is computationally efficient.
Edit:
Some clarification - I do need the output to be exact and not an approximation. This is sort of an optimization of some research work I'm currently doing and I need to have this look-up implemented without the actual bytes of the outputs being present in my program. I can't really say a whole lot about it at the moment, but I will say that for the purposes of my work, encryption (or compression or any other other form of obfuscation) is not an option to hide the table. I need a mathematical function that can recreate the output so long as it has access to the input. I hope that clears things up a bit.
Here is one idea. Make your function be the sum (mod 232) of a linear function over all 4-byte integers, a piecewise linear function whose pieces depend on the value of the first bit, another piecewise linear function whose pieces depend on the value of the first two bits, and so on.
The actual output values appear nowhere, you have to add together linear terms to get them. There is also no direct record of which input values you have. (Someone could conclude something about those input values, but not their actual values.)
The various coefficients you need can be stored in a hash. Any lookups you do which are not found in the hash are assumed to be 0.
If you add a certain amount of random "noise" to your dataset before starting to encode it fairly efficiently, it would be hard to tell what your input values are, and very hard to tell what the outputs are even approximately without knowing the inputs.
Since you didn't impose any restriction on the function (continuous, smooth, etc), you could simply do a piece-wise constant interpolation:
or a linear interpolation:
I assume you can figure out how to construct such a function without too much trouble.
EDIT: In light of your additional requirement that such a function should "hide" the data points...
For a piece-wise constant interpolation, the constant intervals should be randomized so as to not reveal where the data point is. So for example in the picture, the intervals are centered about the data point it's interpolating. Instead, you might want to do something like:
[0 , 0.3) -> 0
[0.3 , 1.9) -> 0.8
[1.9 , 2.1) -> 0.9
[2.1 , 3.5) -> 0.2
etc
Of course, this only hides the x-coordinate. To hide the y-coordinate as well, you can use a linear interpolation.
Simply make it so that the "pointy" part isn't where the data point is. Pick random x-values such that every adjacent data point has one of these x-values in between. Then interpolate such that the "pointy" part is at these x-values.
I suggest a huge Lookup Table full of unused entries. It's the brute-force approach, having an ordered table of outputs, ordered by every possible value of the input (not just the data set, but also all other possible 4-byte value).
Though all of your data would be there, you could fill the non-used inputs with random, arbitrary, or stochastic (random whithin potentially complex constraints) data. If you make it convincing, no one could pick your real data out of it. If a "real" function interpolated all your data, it would also "contain" all the information of your real data, and anyone with access to it could use it to generate an LUT as described above.
LUTs are lightning-fast, but very memory hungry. Your case is on the edge of feasibility, requiring (2^32)*32= 16 Gigabytes of RAM, which requires a 64-bit machine to run. That is just for the data, not the program, the Operating System, or other data. It's better to have 24, just to be sure. If you can afford it, they are the way to go.

Solve Physics exercise by brute force approach

Being unable to reproduce a given result. (either because it's wrong or because I was doing something wrong) I was asking myself if it would be easy to just write a small program which takes all the constants and given number and permutes it with a possible operators (* / - + exp(..)) etc) until the result is found.
Permutations of n distinct objects with repetition allowed is n^r. At least as long as r is small I think you should be able to do this. I wonder if anybody did something similar here..
Yes, it has been done here: Code Golf: All +-*/ Combinations for 3 integers
However, because a formula gives the desired result doesn't guarantee that it's the correct formula. Also, you don't learn anything by just guessing what to do to get to the desired result.
If you're trying to fit some data with a function whose form is uncertain, you can try using Eureqa.

Big O Log problem solving

I have question that comes from a algorithms book I'm reading and I am stumped on how to solve it (it's been a long time since I've done log or exponent math). The problem is as follows:
Suppose we are comparing implementations of insertion sort and merge sort on the same
machine. For inputs of size n, insertion sort runs in 8n^2 steps, while merge sort runs in 64n log n steps. For which values of n does insertion sort beat merge sort?
Log is base 2. I've started out trying to solve for equality, but get stuck around n = 8 log n.
I would like the answer to discuss how to solve this mathematically (brute force with excel not admissible sorry ;) ). Any links to the description of log math would be very helpful in my understanding your answer as well.
Thank you in advance!
http://www.wolframalpha.com/input/?i=solve%288+log%282%2Cn%29%3Dn%2Cn%29
(edited since old link stopped working)
Your best bet is to use Newton;s method.
http://en.wikipedia.org/wiki/Newton%27s_method
One technique to solving this would be to simply grab a graphing calculator and graph both functions (see the Wolfram link in another answer). Find the intersection that interests you (in case there are multiple intersections, as there are in your example).
In any case, there isn't a simple expression to solve n = 8 log₂ n (as far as I know). It may be simpler to rephrase the question as: "Find a zero of f(n) = n - 8 log₂ n". First, find a region containing the intersection you're interested in, and keep shrinking that region. For instance, suppose you know your target n is greater than 42, but less than 44. f(42) is less than 0, and f(44) is greater than 0. Try f(43). It's less than 0, so try 43.5. It's still less than 0, so try 43.75. It's greater than 0, so try 43.625. It's greater than 0, so keep going down, and so on. This technique is called binary search.
Sorry, that's just a variation of "brute force with excel" :-)
Edit:
For the fun of it, I made a spreadsheet that solves this problem with binary search: binary‑search.xls . The binary search logic is in the second data column, and I just auto-extended that.

Resources