I am trying to learn to use profvis. Here is a link with a reproducible example:
profvis example
I am not able to see why there are TWO numbers for Memory, i.e. -3.5 and 9.5 (where Time = 1290), in the first example on that page.
Can someone explain how to interpret the 2 numbers for Memory?
The positive numbers are allocated memory and the negative numbers are deallocated memory (between the previous and current sample).
From the docs:
Memory: Memory allocated or deallocated (for negative numbers) for a
given call stack. This is represented in megabytes and aggregated over
all the call stacks over the code in the given row
Also note that
Interpreting this information can be a little tricky, because it does
not necessarily reflect memory allocated and deallocated at that line
of code. The sampling profiler records information about memory
allocations that happen between the previous sample and the current
one. This means that the allocation/deallocation values on that line
may have actually occurred in a previous line of code.
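A minimal sketch (not the example from the link) that reproduces both kinds of numbers: allocate a large vector, drop it, then allocate another one, so some samples record allocations and others record what the garbage collector released since the previous sample.

library(profvis)

profvis({
  x <- rnorm(5e6)   # roughly 40 MB allocated -> shows up as a positive Memory value
  rm(x)
  invisible(gc())   # memory freed here is reported as a negative value on some row
  y <- rnorm(5e6)   # another ~40 MB allocation
})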
Related
Consensus on set.seed in R is that it effectively generates a long sequence of pseudo-random numbers, pre-determined by the seed. Then the first call you make to this sequence (with the first non-deterministic function you use) takes the first batch from that sequence, the second call takes the next batch, and so forth.
I am wondering what the limits to this are. Specifically, what happens when you get to the end of that long sequence? Let's say, after setting a seed, you then sample from the first 100 integers repeatedly. Would there come a point where you start generating the same samples (in the same order) as you were seeing at the beginning? How long would this take? (Does it depend on the seed?) If not, how would reaching the 'end' of the sequence and presumably circling back to the beginning manifest?
The ?RNGkind help page in R gives more details on the default random number generator, the "Mersenne Twister" algorithm:
"Mersenne-Twister": From Matsumoto and Nishimura (1998); code
updated in 2002. A twisted GFSR with period 2^19937 - 1 and
equidistribution in 623 consecutive dimensions (over the
whole period). The ‘seed’ is a 624-dimensional set of 32-bit
integers plus a current position in that set.
As stated there, the "period" (the number of values generated before it gets back to the beginning and starts repeating) is 2^19937 - 1, or approximately 10^(19937/log2(10)) ≈ 10^6001.
If the size of your "batches" happened to line up exactly with the period, then you would indeed start getting the same batches again.
I'm not sure how many pseudorandom samples R uses to pick a sample of size 1 from a set. Ideally it would be only 1 (so your "batch size" would be 1), but it might be more depending on the generality/complexity of the sampling algorithm.
I know that runif() translates more or less directly from the PRNG, so a sequence of runif() calls would indeed repeat exactly.
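A small sketch illustrating the "position in a long stream" idea (it does not, of course, demonstrate the period itself): resetting the seed replays exactly the same values, and the generator state lives in .Random.seed.

set.seed(42)
a <- runif(3)
set.seed(42)
b <- runif(3)
identical(a, b)       # TRUE: same seed, same starting position, same values

# For the default Mersenne-Twister the state is 624 32-bit integers plus a
# current position, stored (together with the RNG kind) in .Random.seed:
length(.Random.seed)  # 626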
My code eats up to 3GB of memory at a single time. I figured it out using gc():
gc1 <- gc(reset = TRUE)
graf(...) # the code
gc2 <- gc()
cat(sprintf("mem: %.1fMb.\n", sum(gc2[,6] - gc1[,2])))  # col 6 = "max used (Mb)" since the reset, col 2 = "(Mb)" used at the baseline
# mem: 3151.7Mb.
Which I take to mean that at some single point in time, 3151.7 MB are allocated at once.
My goal is to minimize the maximum memory allocated at any single time. How do I figure out which part of my code is responsible for the maximum usage of those 3GB of memory? I.e. the place where those 3GB are allocated at once.
I tried memory profiling with Rprof and profvis, but both seem to show different information (which seems undocumented, see my other question). Maybe I need to use them with different parameters (or use different tool?).
I've been looking at Rprofmem... but:
in the profmem vignette they wrote: "with utils::Rprofmem() it is not possible to quantify the total memory usage at a given time because it only logs allocations and does therefore not reflect deallocations done by the garbage collector."
how do I summarize the output of Rprofmem? This source speaks for itself: "Summary functions for this output are still being designed".
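For what it's worth, the profmem package from that vignette does wrap Rprofmem() and gives a readable per-allocation summary; a minimal sketch, assuming R was built with memory profiling support and with a stand-in workload in place of the real code:

library(profmem)

p <- profmem({
  x <- numeric(1e6)   # stand-in for the real workload, e.g. the graf() call
})
p           # one row per allocation, with the size in bytes and the call stack
total(p)    # total bytes allocated (deallocations are not tracked, as the vignette says)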
My code eats up to 3GB of memory at a single time.
While it looks like your code is consuming a lot of RAM at once by calling one function, you can break down the memory consumption into the implementation details of that function (and its sub-calls) by using RStudio's built-in profiling (based on profvis), which shows the execution time and rough memory consumption. E.g. if I use my demo code:
# graf code taken from the tutorial at
# https://rawgit.com/goldingn/intecol2013/master/tutorial/graf_workshop.html
library(dismo) # install.packages("dismo")
library(GRaF) # install_github('goldingn/GRaF')
data(Anguilla_train)
# loop to call the code under test several times to get better profiling results
for (i in 1:5) {
  # use the SegSumT, SegTSeas and Method columns as covariates
  covs <- Anguilla_train[, c("SegSumT", "SegTSeas", "Method")]
  # use the presence/absence status to fit a simple model
  m1 <- graf(Anguilla_train$Angaus, covs)
}
Start profiling with the Profile > Start Profiling menu item, source the above code and stop the profiling via the same menu.
After Profile > Stop Profiling RStudio shows the result as a Flame Graph, but what you are looking for is hidden in the Data tab of the profile result (unfold the function calls that show heavy memory consumption).
The numbers in the Memory column indicate the memory allocated (positive numbers) and deallocated (negative numbers) for each called function, and the values should include the sum over the whole sub-call tree plus the memory used directly in the function.
My goal is to minimize the maximum memory allocated at any single time.
Why do you want to do that? Do you run out of memory, or do you suspect that repeated memory allocation is causing long execution times?
High memory consumption (or repeated allocations/deallocations) often comes together with slow execution, since copying memory costs time.
So look at the Memory or Time column depending on your optimization goals to find function calls with high values.
If you look into the source code of the GRaF package you can find a loop in the graf.fit.laplace function (up to 50 "newton iterations") that calls "slow" R-internal functions like chol, backsolve, forwardsolve but also slow functions implemented in the package itself (like cov.SE.d1).
Now you can try to find faster (or less memory consuming) replacements for these functions... (sorry, I can't help here).
PS: profvis uses Rprof internally, so the profiling data is collected by probing the current memory consumption at regular time intervals and attributing it to the currently active function (call stack).
Rprof has limitations: the result is not exact, because the garbage collector triggers at non-deterministic times and the freed memory is attributed to whichever function the next probing interval happens to stop in; and it does not recognize memory allocated directly from the OS via C/C++ code or libraries that bypass R's memory management API.
Still, it is the easiest way and normally a good enough indication of memory and performance problems...
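If you want to see the raw data that profvis builds on, here is a rough sketch using Rprof directly (assuming graf() and its inputs from the demo code above):

Rprof("graf.prof", memory.profiling = TRUE, interval = 0.01)
m1 <- graf(Anguilla_train$Angaus, covs)   # the code under test
Rprof(NULL)                               # stop profiling

# mem.total below is the sampled memory described above, with the same caveat
# that values may be attributed to the wrong line or function:
head(summaryRprof("graf.prof", memory = "both")$by.total)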
For an introduction to profvis see: https://rstudio.github.io/profvis/
I'm plotting real-time USB-received data in Qt with QCustomPlot. For better speed, I want to allocate memory before data receiving starts. This is part of my code:
ui->plot_platform->graph(0)->data().data()->clear();
QVector<QCPGraphData> add_data(x);
ui->plot_platform->graph(0)->addData(add_data);
I do this at the start of data receiving. How can I determine the size of x? In other words, how can I find the maximum memory size that I can get from Windows? Is your solution safe from crashing?
Even USB has a limited throughput, so you can estimate the maximum number of points per time unit; that is the x you want to reserve with QVector<QCPGraphData> add_data(x).
On the other hand, if you reserve far too many vector cells the allocation itself may simply fail (QVector stores its elements on the heap, so the practical risk is a failed allocation rather than a stack overflow).
And by the way: your need for speed might make you optimize in the wrong place, since allocating some extra memory is not as time-consuming as updating the plot with ui->plot_platform->graph(0)->addData(add_data);. It may be worth comparing with Qwt (see https://stackoverflow.com/a/24689918/4599792).
Now I have this question where I was asked the cost of deleting a value from a hash table when linear probing was used during insertion.
What I could figure out from reading various material on the internet is that it has something to do with the load factor. Though I am not sure, I read a relation between the load factor and the number of probes required, namely: number of probes = 1 / (1 - LF).
So I believe the cost has to be dependent on the probe sequence. But then another thought ruins everything.
What if the element was inserted in p probes and now I am trying to delete it? But before this I had already deleted a few elements that had the same hash code and were inserted with fewer than p probes.
In this case I reach a stage where I see an empty slot in the hash table, but I am not sure whether the element I am trying to delete has already been deleted or is at some other location as a result of probing.
I also found that once I delete an element I must mark the slot with some special indicator to show that it is available again, but this doesn't solve my problem of being uncertain about the element I am trying to delete.
Could anyone please suggest how to find the cost in such cases?
Is the approach going to vary if it is non-linear probing?
The standard approach is "lookup the element, mark as deleted". Marking obviously has O(1) cost, so the total operation cost is the same as just lookup: O(1) expected. It can be as high as O(n) in degenerate cases (e.g. all elements have the same hash). O(1) expected is all we can say theoretically.
About the load factor: the higher the load factor (the ratio of occupied buckets to the total number of buckets), the larger the expected constant factor (number of probes), but this doesn't change the theoretical O cost. Note that in this case the load factor includes both the elements present in the table and the buckets that were previously marked as deleted.
Other probing kinds (e.g. quadratic) don't change the theoretical cost, but may alter the expected constant factor or its variance. If you look at the probe ("fallback") sequences, with linear probing the sequences of different buckets overlap. This means that if the sequence for some bucket is long, it will also be long for adjacent buckets. E.g. if buckets 4 to 10 are occupied, the sequence for bucket #4 is 7 buckets long (4, 5, 6, ..., 10), for #5 it's 6 buckets, and so on. For quadratic probing this is not the case.
However, linear probing has the benefit of better memory-cache behavior, since you check memory cells close to each other. In practice, though, for quadratic probing fallback sequences are rarely long enough for this to matter.
Finally, in linear probing case, it is possible to work without deleted mark, but for this you'd have to complicate deleting procedure considerably (still O(1) expected, though, but with much higher constant factor). Whether it is worth it has to be decided with actual profiling; for example, this simplifies inserting somewhat and lookup a bit. For a C++ implementation this would have the downside that erase() would invalidate iterators, though.
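For illustration only, here is a toy sketch (in R, since most of this page is R; the names and the trivial hash are made up) of the "lookup the element, mark as deleted" approach with linear probing and a tombstone value:

size    <- 8L
buckets <- rep(NA_character_, size)   # NA = never used
DELETED <- "<deleted>"                # tombstone marker

bucket_of <- function(key) (sum(utf8ToInt(key)) %% size) + 1L   # toy hash

ht_insert <- function(key) {
  i <- bucket_of(key)
  while (!is.na(buckets[i]) && buckets[i] != DELETED)  # probe linearly past occupied slots
    i <- (i %% size) + 1L                              # (assumes the table never fills up)
  buckets[i] <<- key
}

ht_delete <- function(key) {
  i <- bucket_of(key)
  repeat {
    if (is.na(buckets[i])) return(FALSE)   # reached a never-used slot: key is absent
    if (identical(buckets[i], key)) {      # found it: marking is O(1), lookup dominates
      buckets[i] <<- DELETED
      return(TRUE)
    }
    i <- (i %% size) + 1L
  }
}

ht_insert("ash"); ht_insert("bat"); ht_insert("cat")
ht_delete("bat")   # TRUE: found and replaced by the tombstone
ht_delete("dog")   # FALSE: probing stopped at a never-used bucket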
I am using R to calculate the SVD (svd(m)) and it works on a small matrix, but when I pass it a 20Kx20K matrix it gives the following error after processing:
Error in svd(m) : infinite or missing values in 'x'
I checked and there is no row or column with all 0 values and no duplicate rows or columns. All columns have values.
I cannot paste a 20Kx20K matrix here :(
I am guessing that your problem is not related to memory size, although I am not able to process a 20Kx20K matrix on my 4GB memory machine.
The reason for this guess is that the first line of code inside svd() is the following:
if (any(!is.finite(x)))
stop("infinite or missing values in 'x'")
In other words, the svd() function first tests whether there are any non-finite values in your data.
This happens before any further processing. So, if you had memory problems, these would be apparent even before your call to svd().
I suggest you check for infinite values:
x <- c(0, Inf, NA, NULL)
which(!is.finite(x))
[1] 2 3
This indicates that the second and third values are considered not finite. In other words, any Inf, NaN or NA values in your data will cause this error.
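To check the actual matrix rather than a toy vector, a quick sketch (assuming your matrix is called m, as in svd(m)):

bad <- which(!is.finite(m), arr.ind = TRUE)
nrow(bad)   # how many non-finite entries (Inf, NaN or NA) there are
head(bad)   # their row/column positions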
Possibly the svd calculation itself also uses a lot of memory. If we compare to MATLAB, we see that the svd calculation allocates just as much memory as the matrix itself uses, so if you already use 3GB of memory, the svd calculation possibly allocates another 3GB, which gives 6GB of memory.
If you're storing 8-byte doubles, a 20Kx20K matrix means 8*20,000*20,000/1024^3 ≈ 3GB of RAM to hold the whole thing in memory.
I don't know how much RAM you've got available, but I'd look into what R can do to serialize the matrix out to disk as needed.
Is the matrix sparse or banded? Can you do something to decrease the amount of memory you need?
How large is the null space of your matrix? What's the condition number (the ratio of the largest to the smallest singular value)? A large condition number can be an indication of difficulties in solving. A matrix need not have a zero row or column to be nearly singular.
UPDATE:
Based on your comment, I'd say that RAM is the least of your problems. It sounds like it's possible to hold the entire matrix in memory, provided you can address it all. You're running on a 64-bit OS - is your version of R 64-bit as well?
Unfortunately, one of the byproducts of SVD is that you get the size of the null space.
You can get the minimum eigenvalue for your matrix using Jacobi iteration. Lanczos might be a good choice for getting the maximum eigenvalue. It'd be a lot of work to get all of them; you might just want the five smallest and largest to assess.
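In R, iterative methods along those lines are available through, for example, the RSpectra package (just a suggestion; whether it suits your matrix is an assumption on my part):

# install.packages("RSpectra")
library(RSpectra)

# a handful of the largest singular values without computing the full SVD
d_large <- svds(m, k = 5, nu = 0, nv = 0)$d

# if m is symmetric, the largest-magnitude eigenvalues can be obtained similarly
# e_large <- eigs_sym(m, k = 5, which = "LM")$values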
Anytime I experience an error with some software I immediately paste it into a Google search. At least it's comforting to know that I'm not the first to experience a particular problem:
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=Error+in+svd(m)+:+infinite+or+missing+values+in+'x'