Understanding computation graph - graph

I'm trying to understand micro-ops and computation graphs for intel architecture. I have the following graph :
With the standard 6 functional units: 1 load, 1 store, 2 integer, and 2 floating point, can the machine execute the given execution graph?
My answer is yes, because there is only one integer op per each cycle. What I'm not sure about are load ops. I know that they can be pipelined, but don't know if that can be done three at a time. I would like to know if this can be done. Thanks.

Related

How to improve igraph single_paths calculation efficiency

I'm using the function all_simple_paths from the igraph R package: (1) to generate the list of all simple paths in networks (object List_paths_Mp); and (2) to calculate the total number of simple paths (object n_paths).
I'm using the function in the form:
pathsMp <- unlist(lapply(V(graphMp), function(x)all_simple_paths(graphMp, from = x)), recursive =FALSE)
List_paths_Mp <- lapply(1:length(pathsMp), function(x)as_ids(pathsMp[[x]]))
n_paths<-length(List_paths_Mp)
Where:
Mp is a square matrix with either 1 or 0 values, and graphMp is the igraph graph object obtained through the function graph_from_adjacency_matrix.
The function does what I need, but with the increase in the number of variables and interactions the processing time to identify and store the different single paths in the network grows too much and it takes very long to get the results.
In particular, using a network with 11 variables and 60 interactions, there is a total of 146338 possible simple paths. And this already takes a long time to compute. Using a bigger network, with 13 variables and 91 interactions, causes the program to take even longer times to process (after 2 hours the function still didn't run its course, and when called to stop it crashed R).
Is there a way to increase the efficiency of the task (i.e. to get results in a faster way)? Has anyone ever encountered a similar problem and found a solution? And, I know, I could use a CPU with higher processing power, but the point is to have the function to run efficiently (as much as possible) in a normal personal computer.
Edit: here I do the calculations from the graph object, but if someone else has any idea of doing the same from the adjacency matrix, I would welcome it too!

Using results from ODEProblem while it is running

I’m currently studying the documentation of DifferentialEquations.jl and trying to port my older computational neuroscience codes for using it instead of my own, less elegant and performant, ODE solvers. While doing this, I stumbled upon the following question: is it possible to access and use the results returned from the solver as soon as the current step is returned (instead of waiting for the problem to finish)?
I’m looking for a way to e.g. plot in real-time the voltage levels of a simulated neuron, which seems like a simple enough task and one that’s probably trivial to do using already existing Julia packages but I can’t figure out how. Does it have to do anything with callbacks? Thanks in advance.
Plots.jl doesn't seem to be animating for me right now, but I'll show you the steps anyways. Yes, you can use a DiscreteCallback for this. If you make condition(u,t,integrator)=true then the affect! is called every step, and you could do that.
But, I think using the integrator interface is perfect for this case. Let me show you an example of this. Take the 2D problem from the tutorial:
using DifferentialEquations
using Plots
A = [1. 0 0 -5
4 -2 4 -3
-4 0 0 1
5 -2 2 3]
u0 = rand(4,2)
tspan = (0.0,1.0)
f(u,p,t) = A*u
prob = ODEProblem(f,u0,tspan)
Now instead of using solve, use init to get an integrator out.
integrator = init(prob,Tsit5())
The integrator interface is defined in full at its documentation page, but the basic usage is that you can step using step!. If you put that in a loop and keep stepping then that's essentially what solve does. But it also has the iterator interface, so if you do something like for integ in integrator then inside of the for loop integ will be the current state of the integrator, with values integ.u at time point integ.t. It also has all sorts of things like a plot recipe for intermediate interpolation integ(t) (this is true even when dense=false because it's free and doesn't require extra saving allocations, so feel free to use it).
So, you can do
p = plot(integrator,markersize=0,legend=false,xlims=tspan)
anim = @animate for integ in integrator
    plot!(p,integrator,lw=3)
end
plot(p)
gif(anim, "test.gif", fps = 2)
and Plots.jl will give you the animated gif that adds the current interval at each step. Here's what the end plot looks like:
It is colored differently in each step because each step was a different plot, so you can see how it continued. Of course, you can do anything inside of that loop, or if you want more control you can manually step!(integrator) as necessary.

Oh no Another BigO one

I've been doing BigO recently, and I get the formula OK, but I've written a piece of code that takes an input and returns the time taken to complete a sort. So I have the input and the time; how do I use this to classify what sort of BigO it is? I've made graphs and can see which sort they are, but I can't do it using the formula. I'm not strong on maths, which I think is my problem here!
For instance I get:
Size Time Operations
200 2 163648
400 1 162240
800 15 2489456
1600 6 10247376
3200 19 40858160
6400 79 165383984
12800 318 656588080
25600 1274 2624318128
51200 5059 10476803408
102400 20333 41969291968
I know that this is O(n^2) by looking at the graph and comparing, but how do I prove it?
Yes, you can sample a thousand different input sizes, and then try to derive a Big-O value from that, but you shouldn't - not only because it doesn't actually prove anything, but because that isn't the point.
The way to prove O(n^2) is to prove it on the code itself, not through experiments. The actual running time isn't important, because Big-O notation doesn't say anything about that - in simple terms, it only specifies the dominant term of whatever formula you would use to calculate the exact running time, in the sense of the number of operations executed for that function. Constants are thrown away, and so are smaller terms - the actual running time of a function might be 1000n^2+1000000n, but that's still O(n^2).
You can't mathematically prove anything from this table; the complexity might be O(1) if Time remains at 20333 for all larger values.
The best you can do is try fitting several curves to this table and selecting the best fit according to Occam's razor.
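As a rough illustration of such a fit (a sketch only, not a proof), one option is a least-squares line through the log-log data: if the count grows like n^k, the slope of log(operations) against log(size) is close to k. The helper name loglog_slope is made up for this example, and skipping the two smallest sizes is an assumption, made because their counts look irregular.

```python
import math

# Sizes and operation counts copied from the table in the question.
# Assumption: the two smallest sizes (200 and 400) are left out because
# their counts look irregular compared with the rest of the table.
sizes = [800, 1600, 3200, 6400, 12800, 25600, 51200, 102400]
operations = [2489456, 10247376, 40858160, 165383984,
              656588080, 2624318128, 10476803408, 41969291968]


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


slope = loglog_slope(sizes, operations)
print(f"estimated exponent: {slope:.2f}")
```

A slope near 2 is consistent with quadratic growth over this range of sizes; it says nothing about larger inputs.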
You can't prove it by looking at the timings, you can only prove it by analysing the code to see how many steps are performed. The reason for this is that the time taken is a function not only of your program but many other things outside of your control as well.
For example, who can say whether your machine didn't spend an inordinate amount of time in other processes during one particular test run of your program? This sort of thing can be minimised to a point using statistical methods, but the proof requires solid data.
What you can do is look at some of your data points to get support for the contention that it's O(n^2). Have a look at the last four entries:
Input    Time    Ratio to previous time
12800    318
25600    1274    1274 / 318 = 4.006
51200    5059    5059 / 1274 = 3.971
102400   20333   20333 / 5059 = 4.019
You can see that each doubling of the input size multiplies the time by about 4, which would tend to indicate an O(n^2) property.
But this is support only. It applies only to that particular range of input values and, as stated, is subject to factors outside your control. Note also that the support would be harder to see if the time function was not a simple one. For example, if the time function was t = n^2/10 + 123n + 123456789, it would be a little harder to figure out.
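The doubling check can be scripted so it is easy to repeat on new runs. This is a minimal sketch using the last four rows of the table in the question; doubling_ratios is a name invented for the example, and log2 of each ratio is used as a rough estimate of the exponent.

```python
import math

# Last four rows of the table in the question: input size and time taken.
sizes = [12800, 25600, 51200, 102400]
times = [318, 1274, 5059, 20333]


def doubling_ratios(values):
    """Ratio of each value to the one before it."""
    return [b / a for a, b in zip(values, values[1:])]


ratios = doubling_ratios(times)
# When the input doubles, time ~ n^k grows by 2^k, so k ~ log2(ratio).
exponents = [math.log2(r) for r in ratios]

for size, ratio, k in zip(sizes[1:], ratios, exponents):
    print(f"n={size}: ratio {ratio:.3f}, estimated exponent {k:.2f}")
```

As with the hand calculation, this only supports the O(n^2) guess for this range of inputs.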
Simply comparing the values with each other may not tell you much. However, if you plot a graph using these values (x-axis: input, y-axis: time), you will get a curve, a straight line, or some other shape. Using this information, you can predict the Big-O value of that function. Of course, there may sometimes be interrupts that affect the running of the process, but they do not last for the whole run. That is a slight overhead that should not affect the result.
In order to predict the Big-O value, you will need some calculus knowledge to make the analogy between the shape and the Big-O result.
For example, let's say that you got a linear shape and you know that it means O(n). At that point, you reached that result because you know the shape of a linear function's graph and your graph looks like it. In order to reach the true proof, you have to draw both your function's curve and the graph of the mathematical function that has the closest shape to your graph.
There are some other notations, like Big-Theta and Small-Omega, that bound your function from above or from below. The mathematical function could satisfy both of them, but as a result, your Big-O function is the one closest to that shape.

How to Build finite state machine that show modulus 4 in binary

Can someone show me how to build a finite state machine that shows modulus 4 in binary?
Well, a binary number mod 4 is going to be 0 if the last two bits are 00, so that's where you'll want to start. Just think what adding another 1 or 0 to that will do to the last two digits, and do that for each possible state.
I'll leave you with this (big) hint: think about how many possible results you can have in modulus-4. Once you know that, you'll know how many states your machine can have.
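For anyone who wants to check their own state diagram against something, here is one possible sketch. It assumes the bits arrive most-significant-first, so appending a bit b turns the value v into 2*v + b. The names TRANSITIONS and mod4 are made up for this example.

```python
# One state per possible remainder (0, 1, 2, 3); the start state is 0.
# Assumption: bits are read most-significant-first, so appending bit b
# turns the value v into 2*v + b and the remainder r into (2*r + b) % 4.
TRANSITIONS = {(r, b): (2 * r + b) % 4 for r in range(4) for b in (0, 1)}


def mod4(bits: str) -> int:
    """Run the machine over a string of '0'/'1' characters and return the final state."""
    state = 0
    for ch in bits:
        state = TRANSITIONS[(state, int(ch))]
    return state


print(mod4("1101"))  # 13 in binary; 13 mod 4 is 1
```

The transition table has eight entries, one per (state, bit) pair, which is exactly the set of arrows in the diagram.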

Finding area of straight line with graph (Math question but needed for flot)

Okay, so this is a straight math question and I read up on meta that those need to be written to sound like programming questions. I'll do my best...
So I have graph made in flot that shows the network usage (in bytes/sec) for the user. The data is 4 minutes apart when there is activity, and otherwise set at the start of the usage range (let's say day 1) and the end of the range (day 7). The data is coming from a CGI script I have no control over, so I'm fairly limited in what I can provide the user.
I never took trig or calculus, so I'm pretty much in over my head. What I want is for the user to have the option to click any point on the graph and see their bandwidth usage for that moment. Since the lines between real data points are drawn straight, this can be done by getting the points before and after where the user has clicked and finding the y-interval.
It took me weeks to finally get a helpful math person to explain this to me. Everyone else has insisted on trying to teach me Riemann sum techniques and all sorts of other heavy stuff that not only is confusing to me, doesn't seem necessary for the problem.
But I also want the user to be able to highlight the graph from two arbitrary points on the y-axis (time) to get the amount of network usage total during that range. I know this would be inaccurate, but I need it to be the right inaccurate using a solid equation.
I thought this was the area under the line, but experiments with much simpler graphs make this seem just far too high. I figured out I could take the distance from y2 - y1 and multiply it by x2 - x1 and then divide by two to get the area of the graph below the line like a triangle, but again, the numbers seemed too high. (Maybe they are just big numbers and I don't get this math stuff at all.)
So what I need, if anyone would be really awesome enough to provide it before this question is closed down for being too pure-math, is either the name of the concept I should be researching or the equation itself. Or the bad news that I do need advanced math to get an accurate result.
I am not bad at math, just as a last note, I just am not familiar with math beyond 10th grade and so I need some place to start. All the math sites seem to keep it too simple or way over my paygrade.
If I understood correctly what you're asking (and that is somewhat doubtful), you should find what you seek in these links:
Linear interpolation
(calculating the value of the point in between)
Trapezoidal rule
(calculating the area below the "curve")
*****Edit, so we can get this over :) without much ado:*****
So I have graph made in flot that shows the network usage (in bytes/sec) for the user. The data is 4 minutes apart when there is activity, and otherwise set at the start of the usage range (let's say day 1) and the end of the range (day 7). The data is coming from a CGI script I have no control over, so I'm fairly limited in what I can provide the user.
What is a "flot" ?
Okay, so you have speed on the y axis [in bytes/sec] and time on the x axis [in sec], right?
That means that if you're flotting (I'm bored, yes :) speed over time, in linear segments, then by interpolating at some particular point in time you'll get the speed at that particular point in time.
If you wish to calculate how much bandwidth you've spent, you need to determine the area beneath that curve. The area from point "a" to point "b" will determine the bandwidth spent, in [bytes], in that time period.
It took me weeks to finally get a helpful math person to explain this to me. Everyone else has insisted on trying to teach me Riemann sum techniques and all sorts of other heavy stuff that not only is confusing to me, doesn't seem necessary for the problem.
In the immortal words of Snoopy: "Good grief !"
But I also want the user to be able to highlight the graph from two arbitrary points on the y-axis (time) to get the amount of network usage total during that range. I know this would be inaccurate, but I need it to be the right inaccurate using a solid equation.
It would not be inaccurate.
It would be actually perfectly accurate (well, apart from roundoff error in bytes :), since you're using linear interpolation on linear segments.
I thought this was the area under the line, but experiments with much simpler graphs makes this seem just far too high. I figured out I could take the distance from y2 - y1 and multiply it by x2 - x1 and then divide by two to get the area of the graph below the line like a triangle, but again, the numbers seemed to high. (maybe they are just big numbers and I don't get this math stuff at all).
"like a triangle" --> should be "like a trapezoid"
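If you do deltax*(y1+y2)/2 you will get the area (this works only for linear segments): the width of the segment times the average of its two end values. The (y2-y1) version gives only the triangle on top and leaves out the rectangle beneath it. This is the basic principle of the trapezoidal rule.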
If you're uncertain about what you're calculating use dimensional analysis: speed is in bytes/sec, time is in sec, bandwidth is in bytes. Multiplying speed*time=bandwidth, and so on.
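To make the units concrete, here is a small sketch of the area under one straight segment, taken as the width times the average of the two end rates. The function name segment_bytes and the sample numbers are made up for illustration.

```python
def segment_bytes(t1, rate1, t2, rate2):
    """Area under one straight segment: width [sec] times average rate [bytes/sec]."""
    return (t2 - t1) * (rate1 + rate2) / 2


# Hypothetical segment: 100 bytes/sec at t = 0 s rising to 300 bytes/sec at t = 240 s.
total = segment_bytes(0, 100, 240, 300)
print(total)  # 48000.0 (sec * bytes/sec = bytes)
```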
What I want is for the user to have the option to click any point on the graph and see their bandwidth usage for that moment. Since the lines between real data points are drawn straight, this can be done by getting the points before and after where the user has clicked and finding the y-interval.
Yes, that's a good way to find that instantaneous value. When you report that value back, it's in the same units as the y-axis, so that means bytes/sec, right?
I don't know how rapidly the rate changes between points, but it's even simpler if you simply pick the closest point and report its value. You simplify your problem without sacrificing too much accuracy.
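The interpolated lookup could be sketched as below. It assumes the data is a list of (time in seconds, rate in bytes/sec) pairs sorted by strictly increasing time; rate_at and the sample points are hypothetical.

```python
from bisect import bisect_right


def rate_at(points, t):
    """Linearly interpolate the rate at time t from (time, rate) pairs sorted by time."""
    times = [p[0] for p in points]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the data range")
    i = bisect_right(times, t)
    if i == len(points):  # t is exactly the last timestamp
        return points[-1][1]
    (t0, r0), (t1, r1) = points[i - 1], points[i]
    return r0 + (r1 - r0) * (t - t0) / (t1 - t0)


# Hypothetical samples taken 4 minutes (240 s) apart.
points = [(0, 100.0), (240, 300.0), (480, 200.0)]
print(rate_at(points, 120))  # halfway between the first two samples
```

The result is in the same units as the y-axis, so bytes/sec here.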
I thought this was the area under the line, but experiments with much simpler graphs makes this seem just far too high. I figured out I could take the distance from y2 - y1 and multiply it by x2 - x1 and then divide by two to get the area of the graph below the line like a triangle, but again, the numbers seemed to high. (maybe they are just big numbers and I don't get this math stuff at all).
To calculate the total bytes over a given time interval, you should find the index closest to the starting and ending point and multiply the value of y by the spacing of your x-points and add them all together. That will give you the total # of bytes consumed during that time interval, but there's one more wrinkle you might have forgotten.
You said that the points come in "4 minutes apart", and your y-axis is in bytes/second. Remember that units matter. Your area is the sum of bytes/second times a spacing in minutes. To make the units come out right you have to multiply by 60 seconds/minute to get the final value of bytes that you want.
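A short sketch of that sum with the unit conversion made explicit. It assumes evenly spaced samples in bytes/sec; total_bytes and the sample rates are made up for the example.

```python
SECONDS_PER_MINUTE = 60


def total_bytes(rates_bytes_per_sec, spacing_minutes):
    """Sum of rate * spacing, with the spacing converted from minutes to seconds."""
    spacing_seconds = spacing_minutes * SECONDS_PER_MINUTE
    return sum(rate * spacing_seconds for rate in rates_bytes_per_sec)


# Hypothetical rates inside the selected interval, sampled every 4 minutes.
sample_rates = [100, 300, 200]
total = total_bytes(sample_rates, 4)
print(total)  # 144000 bytes
```

Leaving out the factor of 60 would make the result 60 times too small, which is the kind of mismatch worth ruling out first.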
If that "too high" value is still off, consider units again. It's 1024 bytes per kbyte, and 1024*1024 bytes per MB. Check the units of the values you're checking the calculation against.
UPDATE:
No wonder you're having problems. Your original question CLEARLY stated bytes/sec. Even this question is imprecise and confusing. How did you arrive at "amount of data" at a given time stamp? Are those the total bits transferred since the last time stamp? If yes, simply add the values between the start and end of the interval you want and convert to the units convenient for you.
The network usage total is not in bytes (kilo-, mega-, whatever) per second. It would be in just straight bytes (or kilo-, or whatever).
For example, 2 megabytes per second over an interval of 10 seconds would be 20 megabytes total. It would not be 20 megabytes per second.
Or do you perhaps want average bytes per second over an interval?
This would be a lot easier for you if you would accept that there is well-established terminology for the concepts that you are having trouble expressing concisely or accurately, and that these mathematical terms have been around far longer than you. Since you've clearly gone through most of the trouble of understanding the concepts, you might as well break down and start calling them by their proper names.
That said:
There are 2 obvious ways to graph bandwidth, and two ways you might be getting the bandwidth data from the server. First, there's the cumulative usage function, which for any time is simply the total amount of data transferred since the start of the measurement. If you plot this function, you get a graph that never decreases (since you can't un-download something). The units of the values of this function will be bytes or kB or something like that.
What users are typically interested in is the instantaneous usage function, which is an indicator of how much bandwidth you are using right now. In mathematical terms, this is the derivative of the cumulative function. This derivative can take on any value from 0 (you aren't downloading) to the rated speed of your network link (indicating that you're pushing as much data as possible through your connection). The units of this function are bytes per second, or something related like Mbps (megabits per second).
You can approximate the instantaneous bandwidth with the average data usage over the past few seconds. This is computed as
(number of bytes transferred)
-----------------------------------------------------------------
(number of seconds that elapsed while transferring those bytes)
Generally speaking, the smaller the time interval, the more accurate the approximation. For simplicity's sake, you usually want to compute this as "number of bytes transferred since last report" divided by "number of seconds since last report".
As an example, if the server is giving you a report every 4 minutes of "total number of bytes transferred today", then it is giving you the cumulative function and you need to approximate the derivative. The instantaneous bandwidth usage rate you can report to users is:
(total transferred as of now) - (total as of 4 minutes ago) bytes
-----------------------------------------------------------
4*60 seconds
If the server is giving you reports of the form "number of bytes transferred since last report", then you can directly report this to users and plot that data relative to time. On the other hand, if the user (or you) is concerned about a quota on total bytes transferred per day, then you will need to transform the (approximately) instantaneous data you have into the cumulative data. This process, known as computing the integral, is the opposite of computing the derivative, and is in some ways conceptually simpler. If you've kept track of each of the reports from the server and the timestamp, then for each time, the value you plot is the total of all the reports that came in before that time. If you're doing this in realtime, then every time you get a new report, the graph jumps up by the amount in that report.
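Both directions can be sketched in a few lines. This assumes a fixed 4-minute report interval as in the example above; the function names and sample numbers are hypothetical.

```python
from itertools import accumulate

REPORT_INTERVAL_SECONDS = 4 * 60


def rates_from_cumulative(totals):
    """Approximate derivative: bytes gained per report, divided by the interval in seconds."""
    return [(b - a) / REPORT_INTERVAL_SECONDS for a, b in zip(totals, totals[1:])]


def cumulative_from_increments(increments):
    """Integral: running total of the bytes reported so far."""
    return list(accumulate(increments))


# Hypothetical "total bytes transferred today" reports, 4 minutes apart.
totals = [0, 24000, 96000, 96000]
rates = rates_from_cumulative(totals)
print(rates)  # [100.0, 300.0, 0.0] bytes/sec

# Hypothetical "bytes since last report" values, turned back into running totals.
print(cumulative_from_increments([24000, 72000, 0]))  # [24000, 96000, 96000]
```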
I am not bad at math, ... I just am not familiar with math beyond 10th grade
This is like saying "I'm not bad at programming, I have no trouble with ifs and loops but I never got around to writing more than one function."
I would suggest you enrol in a maths class of some kind. An understanding of matrices and the basics of calculus gives you an appreciation of many things, and can be useful in all sorts of areas. You'll be able to understand more of Wikipedia articles and SO answers - and questions!
If you can't afford that, try to find some lecture videos or something.
Everyone else has insisted on trying to teach me Riemann sum techniques
I can't see why. You don't need them for this - though if you had learned them, I expect you would find it easier to come up with a solution. You see, Riemann sums attempt to give you a "familiar" notion of area. The sort of area you (hopefully) learned years ago.
Getting the area below your usage graph between two points will tell you (approximately) how much was used over that period.
How do you find the area of a floor plan? You break it up into rectangles and triangles, find the area of each, and add them together. You can do the same thing with your graph, basically. Someone has worked out a simple way of doing this called the trapezoidal rule. It's just a matter of choosing how to divide your graph into strips, and in your case this is easy: just use the data points themselves as dividers. (You'll also need to work out the value of the graph at the left and right ends of the region selected by the user, using linear interpolation.)
If there's anything I've said that isn't clear to you (as there may well be), please leave a comment.
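Putting the pieces together, the calculation described above might look like the sketch below: the data points act as strip dividers, and the two ends of the selection are filled in by linear interpolation. It assumes (time in seconds, rate in bytes/sec) pairs sorted by strictly increasing time; interpolate, bytes_between and the sample points are made up for illustration.

```python
def interpolate(p0, p1, t):
    """Rate at time t on the straight line between points p0 and p1."""
    (t0, r0), (t1, r1) = p0, p1
    return r0 + (r1 - r0) * (t - t0) / (t1 - t0)


def bytes_between(points, start, end):
    """Trapezoidal area under the piecewise-linear graph between start and end."""
    total = 0.0
    for p0, p1 in zip(points, points[1:]):
        lo = max(start, p0[0])
        hi = min(end, p1[0])
        if lo >= hi:
            continue  # this segment lies outside the selected range
        r_lo = interpolate(p0, p1, lo)
        r_hi = interpolate(p0, p1, hi)
        total += (hi - lo) * (r_lo + r_hi) / 2  # one trapezoid strip
    return total


# Hypothetical samples taken 4 minutes (240 s) apart.
points = [(0, 100.0), (240, 300.0), (480, 200.0)]
print(bytes_between(points, 120, 360))  # selection starts and ends mid-segment
```

Because the graph is drawn with straight lines between samples, this matches the drawn area exactly; how well it matches real usage depends on how the rate behaves between samples.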
