Capping an Aerospike map in Lua

We want to remove elements from a map bin based on its size. Multiple threads will be trying to do this operation concurrently, so we thought writing a UDF to do it would make it synchronized between threads. But remove_by_rank_range is not working inside Lua. Below is the error we are getting:
attempt to call field 'remove_by_rank_range' (a nil value)
Sample Lua code:
function delete(rec)
    local testBinMap = rec.testBin
    map.remove_by_rank_range(testBinMap, 0, 5)
end

The Lua map API does not include most of the operations of the Map data type, as implemented in the clients (for example, the Java client's MapOperation class).
The performance of the native map operations is significantly higher, so why would you use a UDF here, instead of calling remove_by_rank_range from the client?
The next thing to be aware of is that any write operation, whether it comes from a UDF or from a client calling the map remove_by_rank_range method, first grabs a lock on the record. (I answered another Stack Overflow question about this request flow.) Your UDF therefore doesn't give you any advantage over the client map operation for the problem you described.
If you want to cap the size of your map, you should do it at the very same time you're adding new elements to the map. The two operations would be wrapped together with operate(): an insert, followed by the remove, as sketched below. I have an example of how to do this in rbotzer/aerospike-cdt-examples.
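For example, with the Java client both steps can go into a single operate() call. This is only a sketch; the namespace, set, key, bin name, and new entry are placeholders:

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Value;
import com.aerospike.client.cdt.MapOperation;
import com.aerospike.client.cdt.MapPolicy;
import com.aerospike.client.cdt.MapReturnType;

public class MapCapSketch {
    public static void main(String[] args) {
        AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
        Key key = new Key("test", "demo", "cap-example");

        // One atomic operate() call under a single record lock:
        // insert the new entry, then drop the five lowest-ranked entries.
        client.operate(null, key,
            MapOperation.put(MapPolicy.Default, "testBin",
                Value.get("newKey"), Value.get("newValue")),
            MapOperation.removeByRankRange("testBin", 0, 5, MapReturnType.NONE));

        client.close();
    }
}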

Related

What is the use case of firebase-queue sanitize?

I am experimenting with firebase-queue and saw the option for sanitizing. It's described in the docs as:
sanitize - specifies whether the data object passed to the processing
function is sanitized of internal keys reserved for use by the queue.
Defaults to true.
What does it mean?
I am getting an error because I did not specify { sanitize: false }.
When the sanitize option is set, the queue sanitizes (or cleans) the input provided to the processing function so that it resembles that which the original client placed onto the queue, and doesn't contain any of the keys added by the implementation of the queue itself.
If, however, you rely on a key (usually the keys starting with an underscore, e.g. _id) that is added by the queue, and not the original client, you need to set sanitize: false so those keys are returned to your function and they're not undefined.
You can clearly see the difference with a simple processing function that just performs a console.log(data).
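For instance, a minimal sketch (the ref URL is a placeholder, and exactly which internal keys appear may vary):

var Firebase = require('firebase');
var Queue = require('firebase-queue');

var ref = new Firebase('https://<your-app>.firebaseio.com/queue');

// With the default sanitize: true, this logs only what the producing
// client pushed, e.g. { foo: 'bar' }. With sanitize: false, the queue's
// internal keys (such as _id) show up in data as well.
var queue = new Queue(ref, { sanitize: false }, function(data, progress, resolve, reject) {
  console.log(data);
  resolve();
});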
A quick note about why these keys are removed by default: reading or writing directly to the location (as it looks like you may be doing, by passing undefined into the client SDK's child() method instead of data._id) is generally a bad idea from within the worker itself, because direct writes are not guarded by the queue's extensive transaction logic that prevents race conditions. If you can isolate the work to taking input from the provided data field and returning output through the resolve() function, you'll likely have a better time scaling up your queue.

How to Uniquely associate nested call with the original call?

I have a function that uses Unity interception to log method times. The problem comes when I want to log deeper info, like database call time, backend time, etc.
So my method (let's say M1) calls some other method (M2), which in turn calls some other method, and so on, until finally a dbMethod is called that hits the database. I am able to log the time for each function individually, but for the final aggregation in my log server it would be helpful to know, for a given M1 call, how much time the dbMethod took.
So is there some property, like a threadId, that remains the same throughout the nested calls, which I could use in the final aggregation (for joining the M1 and dbMethod log data)? I would like that unique value to be different for each invocation of M1.
For anyone out there facing the same problem: I solved it using the CallContext class. Just add a request id in the outermost call and it will propagate to all the nested inner calls.
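A minimal sketch of that approach; the method names and the "RequestId" key are illustrative, and CallContext lives in System.Runtime.Remoting.Messaging on the .NET Framework:

using System;
using System.Runtime.Remoting.Messaging;

static class CorrelationSketch
{
    public static void M1()
    {
        // Outermost call: one id per invocation of M1.
        CallContext.LogicalSetData("RequestId", Guid.NewGuid());
        M2();
    }

    static void M2()
    {
        DbMethod();
    }

    static void DbMethod()
    {
        // Innermost call: reads the id that M1 set, so the log rows
        // for M1 and DbMethod can be joined on it.
        var requestId = (Guid)CallContext.LogicalGetData("RequestId");
        Console.WriteLine("db call, requestId=" + requestId);
    }
}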

Firebase child syntax performance

I'm starting a project and setting up the code style guide. I like the second of the two styles below, but I'm wondering: is there a difference in the way these two statements are executed? Is the second one slower, or is the whole chain evaluated before any call to the database happens? (I only see a single frame on the socket.)
Style 1:
db.child(`data/projects/${currentProject}/boxes/${newBoxId}`).set(true);
Style 2:
db
  .child('data')
  .child('projects')
  .child(currentProject)
  .child('boxes')
  .child(newBoxId)
  .set(true);
There is no performance difference between the two.
Calling child() does not require the client to connect to the server; it's a purely client-side operation. So you can do whatever is most readable in your code.
If I have a set of static segments in the path, I typically combine them in a single child() call. But when there is a dynamic segment, I prefer to put that into its own child() call to prevent having to concatenate strings.
ref.child('users').child(authData.uid).child('documents/passport').on('value'...

Is getting a value using range not thread-safe in Go?

When ranging over a map m that has concurrent writers, including ones that could delete entries from the map, is it not thread-safe to do this:
for k, v := range m { ... }
I'm thinking to be thread-safe I need to prevent other possible writers from changing the value v while I'm reading it, and (when using a mutex and because locking is a separate step) verify that the key k is still in the map. For example:
for k := range m {
    m.mutex.RLock()
    v, found := m[k]
    m.mutex.RUnlock()
    if found {
        ... // process v
    }
}
(Assume that other writers are write-locking m before changing v.) Is there a better way?
Edit to add: I'm aware that maps aren't thread-safe. However, they are thread-safe in one way, according to the Go spec at http://golang.org/ref/spec#For_statements (search for "If map entries that have not yet been reached are deleted during iteration"). That passage indicates that code using range needn't be concerned about other goroutines inserting into or deleting from the map. My question is: does this thread-safety extend to v, such that I can get v for reading using only for k, v := range m and no other synchronization? I created some test code to try to force a crash and prove that it doesn't work, but even running blatantly thread-unsafe code (lots of goroutines furiously modifying the same map value with no locking in place) I couldn't get Go to crash!
No, map operations are not atomic/thread-safe, as the commenter on your question pointed out; see the Go FAQ entry "Why are map operations not defined to be atomic?".
To make access safe, you are encouraged to use Go's channels as a means of passing around a resource access token. Anyone wanting to modify the map requests the token from the channel, blocking or non-blocking; when done working with the map, it passes the token back to the channel.
Iterating over and working with the map should be simple and short enough that you should be fine using just one token for full access.
If that is not the case (you use the map for more complex work, or a resource consumer needs more time with it), you can implement separate reader and writer access tokens. At any given time only one writer can access the map, but when no writer is active the token is passed to any number of readers, who will not modify the map and thus can read simultaneously.
For an introduction to channels, see the Effective Go docs on channels.
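Here's a minimal, runnable sketch of the single-token approach (all names are illustrative):

package main

import "fmt"

func main() {
    m := map[string]int{}

    // A one-slot channel whose single value is the access token.
    token := make(chan struct{}, 1)
    token <- struct{}{}

    done := make(chan struct{})
    go func() {
        // Writer: take the token, mutate the map, hand the token back.
        <-token
        m["answer"] = 42
        token <- struct{}{}
        close(done)
    }()
    <-done

    // Reader: the same protocol guards the whole range loop.
    <-token
    for k, v := range m {
        fmt.Println(k, v)
    }
    token <- struct{}{}
}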
You could use concurrent-map to handle the concurrency pains for you.
// Assumes: import cmap "github.com/orcaman/concurrent-map"

// Create a new map.
m := cmap.New()

// Add an item to the map: stores "bar" under key "foo".
m.Set("foo", "bar")

// Retrieve an item from the map.
tmp, ok := m.Get("foo")

// Check whether the item exists.
if ok {
    // The map stores items as interface{}, so a cast is needed.
    bar := tmp.(string)
    _ = bar
}

// Remove the item under key "foo".
m.Remove("foo")

Lua producer-consumer pattern with consumers waiting for different data

The problem
One data source generating data in the format {key, value}
Multiple receivers, each waiting for a different key
Example
Getting data runs in a loop. Sometimes I will want to get the next value labelled with a given key by using
Value = MyClass:GetNextValue(Key)
I want my code to stop there until the value is ready (making some sort of future(?) value). I've tried using simple coroutines, but they work only when waiting for any data.
So the question I want to ask is something like: how do I implement async values in Lua using coroutines or a similar concept (without threads)?
Side notes
The main processing function will, apart from returning values to waiting consumers, process some of the incoming data (say, data labelled with a special key) itself.
The full usage context should look something like:
-- in loop
ReceiveData()
ProcessSpecialData()
--
-- Called outside the loop:
V = RequestDataWithGivenKey(Key)
How to implement async values
You start by not implementing async values. You implement async functions: you don't get the value back until it has been retrieved.
First, your code must be in a Lua coroutine. I'll assume you understand the care and feeding of coroutines. I'll focus on how to implement RequestDataWithGivenKey:
function RequestDataWithGivenKey(key)
    local request = FunctionThatStartsAsyncGetting(key)
    -- Yield until the request has completed, in case the coroutine
    -- is resumed before the value is ready.
    while not request:IsComplete() do
        coroutine.yield()
    end
    -- Request is complete. Return the value.
    return request:GetReturnedValue()
end
FunctionThatStartsAsyncGetting returns a request back to the function. The request is an object that stores all of the data needed to process the specific request. It represents asking for the value. This should be a C function that starts the actual async getting.
The request will be either a userdata or an encapsulated Lua table that stores enough information to communicate with the C code that's doing the async fetching. IsComplete uses the internal request data to see if that request has completed. GetReturnedValue can only be called when IsComplete returns true; it puts the value on the Lua stack so that this function can return it.
Your external code simply needs to handle the async stuff internally. Between resumes of these Lua coroutines, you'll need to pump whatever async stuff is doing the fetching, if there are outstanding requests.
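For completeness, here's a pure-Lua sketch of the dispatch side, with no C code involved: consumers suspend inside RequestDataWithGivenKey, and the receive loop resumes whichever coroutines were waiting for the arriving key (all names here are illustrative):

local waiting = {}  -- key -> list of suspended consumer coroutines

function RequestDataWithGivenKey(key)
    local co = coroutine.running()
    waiting[key] = waiting[key] or {}
    table.insert(waiting[key], co)
    -- Suspend until the dispatcher resumes us with the value.
    return coroutine.yield()
end

-- Called from the main receive loop for every {key, value} produced.
function Dispatch(key, value)
    local consumers = waiting[key]
    if not consumers then return end
    waiting[key] = nil
    for _, co in ipairs(consumers) do
        coroutine.resume(co, value)
    end
end

Each consumer runs inside coroutine.create(...), and the ReceiveData loop calls Dispatch(key, value) for every pair it pulls from the source.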
