I've and Zope utility with a method that perform network processes.
As the result of the is valid for a while, I'm using plone.memoize.ram to cache the result.
MyClass(object):
#cache(cache_key)
def do_auth(self, adapter, data):
# performing expensive network process here
...and the cache function:
def cache_key(method, utility, data):
return time() // 60 * 60))
But I want to prevent the memoization to take place when the do_auth call returns empty results (or raise network errors).
Looking at the plone.memoize code it seems I need to raise ram.DontCache() exception, but before doing this I need a way to investigate the old cached value.
How can I get the cached data from the cache storage?
I put this together from several code I wrote...
It's not tested but may help you.
You may access the cached data using the ICacheChooser utility.
It's call method needs the dotted name to the function you cached, in your case itself
key = '{0}.{1}'.format(__name__, method.__name__)
cache = getUtility(ICacheChooser)(key)
storage = cache.ramcache._getStorage()._data
cached_infos = storage.get(key)
In cached_infos there should be all infos you need.
Related
I would like to implement a server with HTTP.jl and julia. After some computation the server would return a "large" file (about several 100 MB). I would like to avoid having to read all the file in memory and then send it to the client.
Some framework allow have a specific function for this (e.g. Flask http://flask.pocoo.org/docs/0.12/api/#flask.send_file) or allow to stream the content to the client (http://flask.pocoo.org/docs/0.12/patterns/streaming/).
Are one for these two options also available in HTTP.jl ? Or any other Julia web package?
Here is a test code which reads the file testfile.txt, but I want to avoid loading the complete file in memory.
import HTTP
f = open("testfile.txt","w")
write(f,"test")
close(f)
router = HTTP.Router()
function testfun(req::HTTP.Request)
f = open("testfile.txt")
data = read(f)
close(f)
return HTTP.Response(200,data)
end
HTTP.register!(router, "GET", "/testfun",HTTP.HandlerFunction(testfun))
server = HTTP.Servers.Server(router)
task = #async HTTP.serve(server, ip"127.0.0.1", 8000; verbose=false)
sleep(1.0)
req = HTTP.request("GET","http://127.0.0.1:8000/testfun/")
# end server
put!(server.in, HTTP.Servers.KILL)
#show String(req.body)
You can use memory mapped IO like this:
function testfun(req::HTTP.Request)
data = Mmap.mmap(open("testfile.txt"), Array{UInt8,1})
return HTTP.Response(200,data)
end
data now looks like a normal byte array to julia, but is actually liked to the file, which might be exactly what you want. The file will be closed upon garbage collection - if you have many requests and no garbage collection is triggered, you might end up with a lot of open files. If your request takes quite long anyway, you might consider calling gc() at the begin of the request.
The short version
How would I go about blocking the access to a file until a specific function that both involves read and write processes to that very file has returned?
The use case
I often want to create some sort of central registry and there might be more than one R process involved in reading from and writing to that registry (in kind of a "poor man's parallelization" setting where different processes run independently from each other except with respect to the registry access).
I would not like to depend on any DBMS such as SQLite, PostgreSQL, MongoDB etc. early on in the devel process. And even though I later might use a DBMS, a filesystem-based solution might still be a handy fallback option. Thus I'm curious how I could realize it with base R functionality (at best).
I'm aware that having a lot of reads and writes to the file system in a parallel setting is not very efficient compared to DBMS solutions.
I'm running on MS Windows 8.1 (64 Bit)
What I'd like to get a deeper understanding of
What actually exactly happens when two or more R processes try to write to or read from a file at the same time? Does the OS figure out the "accesss order" automatically and does the process that "came in second" wait or does it trigger an error as the file access might is blocked by the first process? How could I prevent the second process from returning with an error but instead "just wait" until it's his turn?
Shared workspace of processes
Besides the rredis Package: are there any other options for shared memory on MS Windows?
Illustration
Path to registry file:
path_registry <- file.path(tempdir(), "registry.rdata")
Example function that registers events:
registerEvent <- function(
id=gsub("-| |:", "", Sys.time()),
values,
path_registry
) {
if (!file.exists(path_registry)) {
registry <- new.env()
save(registry, file=path_registry)
} else {
load(path_registry)
}
message("Simulated additional runtime between reading and writing (5 seconds)")
Sys.sleep(5)
if (!exists(id, envir=registry, inherits=FALSE)) {
assign(id, values, registry)
save(registry, file=path_registry)
message(sprintf("Registering with ID %s", id))
out <- TRUE
} else {
message(sprintf("ID %s already registered", id))
out <- FALSE
}
out
}
Example content that is registered:
x <- new.env()
x$a <- TRUE
x$b <- letters[1:5]
Note that the content usually is "nested", i.e. RDBMS would not be really "useful" anyway or at least would involve some normalization steps before writing to the DB. That's why I prefer environments (unique variable IDs and pass-by-reference is possible) over lists and, if one does make the step to use a true DBMS, I would rather turn NoSQL approaches such as MongoDB.
Registration cycle:
The actual calls might be spread over different processes, so there is a possibility of concurrent access atempts.
I want to have other processes/calls "wait" until a registerEvent read-write cycle is finished before doing their read-write cycle (without triggering errors).
registerEvent(values=list(x_1=x, x_2=x), path_registry=path_registry)
registerEvent(values=list(x_1=x, x_2=x), path_registry=path_registry)
registerEvent(id="abcd", values=list(x_1=x, x_2=x),
path_registry=path_registry)
registerEvent(id="abcd", values=list(x_1=x, x_2=x),
path_registry=path_registry)
Check registry content:
load(path_registry)
ls(registry)
See filelock R package, available since 2018. It is cross-platform. I am using it on Windows and have not found a single problem.
Make sure to read the documentation.
?filelock::lock
Although the docs suggest to leave the lock file, I have had no problems removing it on function exit in a multi-process environment:
on.exit({filelock::unlock(lock); file.remove(path.lock)})
I saw a very strange behavior in my rebus handler which is self hosted in exe. Right after sending response using bus.send method it adds up some memory consumed by process. I tried to look up object graph using memory profile and found that rebus is holding response message in serialized format somewhere.
Object graph was showing below hierarchy to the root.
System.Message --> CachedBodyMessage --> stream
Give me some pointers if anybody is aware of this thing.
I understand that a memory leak is a grave concern, but my belief is that it is unlikely that Rebus should contain a memory leak.
This belief is rooted in the fact that I have been running Windows Service-hosted Rebus endpoints in production for 1,5 years now, and several of them (e.g. the timeout managers) have sometimes been running for several months without being restarted.
I'd like to be absolutely bulletproof sure though, so I'm willing to investigate the issue you're reporting.
You're mentioning "CachedBodyMessage" - judging by the names of fields inside System.Messaging.Message, it sounds like it's something within MSMQ. To try to reproduce your issue, I coded the following test:
[Test, Ignore("Only works in RELEASE mode because otherwise object references are held on to for the duration of the method")]
public void DoesNotLeakMessages()
{
// arrange
const string inputQueueName = "test.leak.input";
var queue = new MsmqMessageQueue(inputQueueName);
disposables.Add(queue);
var body = Encoding.UTF8.GetBytes(new string('*', 32768));
var message = new TransportMessageToSend
{
Headers = new Dictionary<string, object> { { Headers.MessageId, "msg-1" } },
Body = body
};
var weakMessageRef = new WeakReference(message);
var weakBodyRef = new WeakReference(body);
// act
queue.Send(inputQueueName, message, new NoTransaction());
message = null;
body = null;
GC.Collect();
GC.WaitForPendingFinalizers();
// assert
Assert.That(weakMessageRef.IsAlive, Is.False, "Expected the message to have been collected");
Assert.That(weakBodyRef.IsAlive, Is.False, "Expected the body bytes to have been collected");
}
which verifies that the sent transport message is collected as it should (will only do this in RELEASE mode though, because of the way DEBUG mode holds on to object references within scope)
I'll try and run the TimePrinter sample now and leave it running for a while to see if I can reproduce the issue. If you stumble upon more information about e.g. exactly which objects are leaking, it would be very helpful.
Thanks again for taking the time to report your worries to me :)
Followup:
I've modified the TimePrinter sample so that it sends 50 msg/s and includes a 64 KB random string payload with each message, and I've tracked the memory usage for almost four hours now. As you can see, it does not look like memory is being leaked.
I'll leave it running the rest of the day, just to be sure.
Maybe you can tell me some more about why you suspected there was a memory leak in the first place?
Update:
As you can see from the trace, it has now been running for 7 hours and thus more than 1,200,000 messages containing more than 70 GB of data has been sent and consumed by the same process. If cached message bodies were leaking, I am pretty sure that we would have been able to see something rising on the graph.
I know client side _underscore.js can be used to throttle click rates, but how do you throttle calls server side? I thought of using the same pattern but unfortunately _throttle doesn't seem to allow for differentiating between Meteor.userId()'s.
Meteor.methods({
doSomething: function(arg1, arg2){
// how can you throttle this without affecting ALL users
}
);
Here's a package I've roughed up - but not yet submitted to Atmosphere (waiting until I familiarize myself with tinytest and write up unit tests for it).
https://github.com/zeroasterisk/Meteor-Throttle
Feel free to play with it, extend, fix and contribute (pull requests encouraged)
The concept is quite simple, and it only runs (should only be run) on the server.
You would first need to come up with a unique key for what you want to throttle...
eg: Meteor.userId() + 'my-function-name' + 'whatever'
This system uses a new Collection 'throttle' and some helper methods to:
check, set, and purge records. There is also a helper checkThenSet
method which is actually the most common pattern, check if we can do something,
and the set a record that we did.
Usage
(Use Case) If your app is sending emails, you wouldn't want to send the same email over
and over again, even if a user triggered it.
// on server
if (!Throttle.checkThenSet(key, allowedCount, expireInSec)) {
throw new Meteor.Error(500, 'You may only send ' + allowedCount + ' emails at a time, wait a while and try again');
}
....
On Throttle Methods
checkThenSet(key, allowedCount, expireInSec) checks a key, if passes it then sets the key for future checks
check(key, allowedCount) checks a key, if less than allowedCount of the (unexpired) records exist, it passes
set(key, expireInSec) sets a record for key, and it will expire after expireInSec seconds, eg: 60 = 1 min in the future
purge() expires all records which are no longer within timeframe (automatically called on every check)
Methods (call-able)
throttle(key, allowedCount, expireInSec) --> Throttle.checkThenSet()
throttle-check(key, allowedCount) --> Throttle.check()
throttle-set(key, expireInSec) --> Throttle.set()
there is not built in support for this currently in meteor, but its on the roadmap https://trello.com/c/SYcbkS3q/18-dos-hardening-rate-limiting
in theory you could use some of the options here Throttling method calls to M requests in N seconds but you would have to roll your own solution
When i use loader.load(request); for the first time, my flex freeze for 10 secondes before posting the data (i can see the web server result in real time).
However if redo a similar POST with other data but same request.url, it's instantaneous.
// Multi form encoded data
variables = new URLVariables();
variables.user = "aaa";
variables.boardjpg = new URLFileVariable(data.boardBytes, "foo.jpg");
request = new URLRequestBuilder(variables).build();
request.url = "http://localhost:8000/upload/";
loader.load(request);
How can i see what is taking so long ?
Thanks !
Ok, this is an old question, anyway I find it searching for other things so quick adding this
URLFileVariables nor URLRequestBuilder are core classes in AS3, so I guess you're using some custom library to build your request. I don't know which library you use, but it seems that the purpose is to serialize some binary data to build a POST. Serializing usually takes some times the first time (lookup initialization and the like) and goes faster next, a well known example is Remoting in his different flavours