The classes in the previous date-time API are not thread safe. I want to know how thread safety was achieved in the new date-time API in Java 8. (Earlier, code could also be made thread safe by synchronizing or by creating a separate instance for each thread.) What did Java 8 add that is new? Please give some examples as well. Thank you.
The SimpleDateFormat that's existed since the early days of Java uses internal fields to hold temporary state but doesn't do anything to prevent two threads from updating them concurrently. This led to the wrong date being returned if two threads happened to call the format or parse methods on the same SimpleDateFormat instance at the same time, since each would modify the internal state of the SimpleDateFormat object whilst the other was still using that state.
Java 8 hasn't done anything to change SimpleDateFormat. Instead it has introduced a whole new API, java.time (LocalDate and related classes), whose classes are immutable, so there are no mutable fields for concurrent callers to corrupt and no locking is needed. It also removes much of the complexity around time zones and pre-1990 dates that was a headache for users of the old Date APIs.
The thread safety in java.time (the modern Java date and time API introduced from Java 8) is obtained through immutable classes. An immutable object is always thread-safe (see the modification of the last statement near the bottom of the first link). As Holger notes in a comment,
without mutation, there can’t be any inconsistencies.
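To make the point about immutability concrete, here is a minimal sketch. The class and method names (SharedFormatterDemo, formatConcurrently) are made up for illustration; only LocalDate, DateTimeFormatter and the java.util.concurrent classes are real API. It shares a single DateTimeFormatter between threads, which would be unsafe with SimpleDateFormat.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class, not part of any library.
final class SharedFormatterDemo {
    // DateTimeFormatter is immutable, so one instance can be shared by all threads.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Formats the same date from several pool threads and returns every result.
    static List<String> formatConcurrently(LocalDate date, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(CompletableFuture.supplyAsync(() -> FORMAT.format(date), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2014, 3, 18);
        // "Modifying" methods return a new object; start itself never changes.
        LocalDate next = start.plusDays(1);
        System.out.println(start + " -> " + next);
        System.out.println(formatConcurrently(start, 8));
    }
}
```

Because plusDays returns a new LocalDate instead of changing the receiver, there is never a half-updated object for another thread to observe.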
Links:
Does Immutability Really Mean Thread Safety?
Immutable objects are thread safe, but why?
Related
I am learning about F# agents (MailboxProcessor).
I am dealing with a rather unconventional problem.
I have one agent (dataSource) which is a source of streaming data. The data has to be processed by an array of agents (dataProcessor). We can consider dataProcessor as some sort of tracking device.
Data may flow in faster than the speed with which the dataProcessor may be able to process its input.
It is OK to have some delay. However, I have to ensure that each agent stays on top of its work and does not get buried under obsolete observations.
I am exploring ways to deal with this problem.
The first idea is to implement a stack (LIFO) in dataSource. dataSource would send over the latest observation available when dataProcessor becomes available to receive and process the data. This solution may work, but it may get complicated: dataProcessor may need to be blocked and re-activated, and would have to communicate its status to dataSource, leading to a two-way communication problem. This problem may boil down to a blocking queue in the producer-consumer problem, but I am not sure.
The second idea is to have dataProcessor take care of message sorting. In this architecture, dataSource will simply post updates to dataProcessor's queue. dataProcessor will use Scan to fetch the latest data available in its queue. This may be the way to go. However, I am not sure whether the current design of MailboxProcessor makes it possible to clear a queue of messages, deleting the older obsolete ones. Furthermore, here, it is written that:
Unfortunately, the TryScan function in the current version of F# is
broken in two ways. Firstly, the whole point is to specify a timeout
but the implementation does not actually honor it. Specifically,
irrelevant messages reset the timer. Secondly, as with the other Scan
function, the message queue is examined under a lock that prevents any
other threads from posting for the duration of the scan, which can be
an arbitrarily long time. Consequently, the TryScan function itself
tends to lock-up concurrent systems and can even introduce deadlocks
because the caller's code is evaluated inside the lock (e.g. posting
from the function argument to Scan or TryScan can deadlock the agent
when the code under the lock blocks waiting to acquire the lock it is
already under).
Having the latest observation bounced back may be a problem.
The author of this post, Jon Harrop, suggests that
I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
This idea is surely worth exploring but, before starting to play around with code, I would welcome some inputs on how I could structure my solution.
Thank you.
Sounds like you might need a destructive-scan version of the mailbox processor. I implemented this with TPL Dataflow in a blog series that you might be interested in.
My blog is currently down for maintenance but I can point you to the posts in markdown format.
Part1
Part2
Part3
You can also check out the code on github
I also wrote about the issues with scan in my lurking horror post
Hope that helps...
tl;dr I would try this: take the Mailbox implementation from FSharp.Actor or Zach Bray's blog post, replace ConcurrentQueue with ConcurrentStack (plus add some bounded capacity logic) and use this changed agent as a dispatcher to pass messages from dataSource to an army of dataProcessors implemented as ordinary MBPs or Actors.
tl;dr2 If workers are a scarce and slow resource and we need to process the message that is the latest at the moment a worker is ready, then it all boils down to an agent with a stack instead of a queue (with some bounded capacity logic) plus a BlockingQueue of workers. The dispatcher dequeues a ready worker, then pops a message from the stack and sends this message to the worker. After the job is done, the worker enqueues itself to the queue when it becomes ready (e.g. before let! msg = inbox.Receive()). The dispatcher's consumer thread then blocks until any worker is ready, while the producer thread keeps the bounded stack updated. (A bounded stack could be done with an array + offset + size inside a lock; what follows below is a more complex design.)
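The bounded stack mentioned in the parenthesis could be sketched roughly as follows. This is a Java sketch rather than F#, it uses an ArrayDeque under a lock instead of a raw array + offset + size, and LatestFirstBuffer with its method names is a hypothetical helper, not something from FSharp.Actor or the BCL.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: a bounded LIFO buffer that drops the oldest entry when full.
final class LatestFirstBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    LatestFirstBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Producer side: never blocks; discards the oldest observation if the buffer is full.
    synchronized void push(T item) {
        if (items.size() == capacity) {
            items.removeLast();   // the tail holds the oldest item
        }
        items.addFirst(item);     // the head holds the newest item
        notifyAll();              // wake any consumer waiting in takeLatest
    }

    // Consumer side: blocks until an item exists, then returns the newest one.
    synchronized T takeLatest() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        return items.removeFirst();
    }

    // Non-blocking variant: returns the newest item, or null if the buffer is empty.
    synchronized T pollLatest() {
        return items.pollFirst();
    }

    synchronized int size() {
        return items.size();
    }
}
```

A dispatcher thread would call takeLatest() each time a worker reports ready, so the worker always gets the freshest observation while stale ones fall off the tail.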
Details
MailboxProcessor is designed to have only one consumer. This is even commented in the source code of MBP here (search for the word 'DRAGONS' :) )
If you post your data to an MBP, then only one thread can take it from the internal queue or stack.
In your particular use case I would use ConcurrentStack directly or, better, wrapped in a BlockingCollection:
It will allow many concurrent consumers
It is very fast and thread safe
BlockingCollection has a BoundedCapacity property that lets you limit the size of the collection. Once the limit is reached, Add blocks until space is available, but you can use TryAdd to avoid waiting. If A is the main stack and B is a standby, then TryAdd to A; on false, Add to B and swap the two with Interlocked.Exchange, then process the needed messages in A, clear it, and make a new standby - or use three stacks if processing A could take longer than it takes B to fill up again. In this way you do not block and do not lose any messages, but can discard unneeded ones in a controlled way.
BlockingCollection has methods like AddToAny/TakeFromAny, which work on arrays of BlockingCollections. This could help, e.g.:
dataSource produces messages to a BlockingCollection with ConcurrentStack implementation (BCCS)
another thread consumes messages from BCCS and sends them to an array of processing BCCSs. You said that there is a lot of data. You may sacrifice one thread to be blocking and dispatching your messages indefinitely
each processing agent has its own BCCS or is implemented as an Agent/Actor/MBP to which the dispatcher posts messages. In your case you need to send a message to only one processorAgent, so you may store the processing agents in a circular buffer to always dispatch a message to the least recently used processor.
Something like this:
(data stream produces 'T)
|
[dispatcher's BCCS]
|
(a dispatcher thread consumes 'T and pushes to processors, manages capacity of BCCS and LRU queue)
| |
[processor1's BCCS/Actor/MBP] ... [processorN's BCCS/Actor/MBP]
| |
(process) (process)
Instead of ConcurrentStack, you may want to read about the heap data structure. If you need the latest messages by some property of the messages, e.g. a timestamp, rather than by the order in which they arrive at the stack (e.g. if there could be delays in transit and arrival order <> creation order), you can get the latest message by using a heap.
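As a rough illustration of the heap idea, here is a Java sketch that orders messages by timestamp so the newest is always at the top, regardless of arrival order. Observation, its fields and newestFirstQueue are hypothetical names; PriorityQueue is the standard binary heap in java.util.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical demo class, not part of any library.
final class LatestByTimestamp {
    // Hypothetical message type: a payload plus its creation time.
    static final class Observation {
        final long timestamp;
        final String payload;

        Observation(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // A max-heap on timestamp: poll() returns the most recently created observation.
    static PriorityQueue<Observation> newestFirstQueue() {
        Comparator<Observation> byTimestamp =
                Comparator.comparingLong((Observation o) -> o.timestamp);
        return new PriorityQueue<>(byTimestamp.reversed());
    }

    public static void main(String[] args) {
        PriorityQueue<Observation> queue = newestFirstQueue();
        // Arrival order differs from creation order.
        queue.add(new Observation(20L, "second"));
        queue.add(new Observation(30L, "third"));
        queue.add(new Observation(10L, "first"));
        System.out.println(queue.poll().payload);
    }
}
```

PriorityQueue itself is not thread safe; with concurrent producers and consumers, java.util.concurrent.PriorityBlockingQueue accepts the same comparator.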
If you still need Agent semantics/API, you could read several sources in addition to Dave's links, and somehow adapt an implementation to multiple concurrent consumers:
An interesting article by Zach Bray on an efficient Actors implementation. There you do need to replace (under the comment // Might want to schedule this call on another thread.) the line execute true with a line async { execute true } |> Async.Start or similar, because otherwise the producing thread will be the consuming thread - not good for a single fast producer. However, for a dispatcher like the one described above this is exactly what is needed.
The FSharp.Actor (aka Fakka) development branch and the FSharp MBP source code (first link above) could be very useful for implementation details. The FSharp.Actor library has been in a freeze for several months but there is some activity in the dev branch.
The discussion about Fakka in Google Groups should not be missed in this context.
I have a somewhat similar use case and for the last two days I have researched everything I could find on the F# Agents/Actors. This answer is a kind of TODO for myself to try these ideas, of which half were born during writing it.
The simplest solution is to greedily eat all messages in the inbox when one arrives and discard all but the most recent. Easily done using TryReceive:
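// Keep polling with a zero timeout; once the inbox is empty, return the last message seen.
let rec readLatestLoop oldMsg =
  async { let! newMsg = inbox.TryReceive 0
          match newMsg with
          | None -> return oldMsg
          | Some newMsg -> return! readLatestLoop newMsg }

// Wait for at least one message, then discard everything but the most recent.
let readLatest() =
  async { let! msg = inbox.Receive()
          return! readLatestLoop msg }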
When faced with the same problem I architected a more sophisticated and efficient solution that I called cancellable streaming and described in an F# Journal article here. The idea is to start processing messages and then cancel that processing if they are superseded. This significantly improves concurrency if significant processing is being done.
In the docs here:
http://www.sqlite.org/threadsafe.html
For serialized mode it says:
"In serialized mode, SQLite can be safely used by multiple threads with no restriction."
I want to make sure I understand the guarantee presented there. If a single database connection is opened using the "SQLITE_OPEN_FULLMUTEX" flag and two threads simultaneously try to call sqlite3_exec at the exact same instant, does Sqlite automatically serialize the calls?
The answer is yes. sqlite3_exec() will grab a mutex when the function is entered and release the mutex once the function is left. Only one thread can own the mutex at any given time, so only one thread can execute sqlite3_exec() at any given time. If two threads try to execute sqlite3_exec() at the same time, one will wait for the other.
While debugging a classic ASP application (and learning about classic ASP at the same time) I've encountered the following
Application("Something") = "some value"
and elsewhere in the code this value gets used thus:
someObj.Property = Session("Something")
How does the Application object relate to Session?
A Session variable is linked to a user. An Application variable is shared between all users.
Application is a handy vault for storing things you want to persist but you can't guarantee they'll always be there. So think low-end caching, short-term variable storage, etc.
In this context with these definitions, they have very little to do with each other except that getting and setting variables is roughly the same for each.
Note: there can be concurrency issues when using Application (because you could easily have more than one user hitting something that reads or writes to it) so I suggest you use Application.Lock before you write and Application.Unlock after you're done. This only really applies to writing.
Note 2: I'm not sure if it automatically unlocks after the request is done (that would be sensible) but I wouldn't trust it to. Make sure that any part of the application that could conceivably explode isn't within a lock, otherwise you might end up locking other users out.
Note 3: In that same vein, don't put things that take a long time to process inside a lock, only the bit where you write the data. If you do something that takes 10 seconds while in a lock, you lock everybody else out.
I have the feeling that I should not have to care about thread-safe reads from and writes to a
public static int MyVar = 12;
in ASP .NET.
I read and write this variable from various user threads. Let's suppose this variable stores the number of clicks on a certain button/link.
My theory is that no two threads can read/write this variable at the same time. It's just a simple 4-byte variable.
I do care about thread safety, but only for reference objects and List instances or other types that take more cycles to read/update.
Am I wrong in my presumption?
EDIT
I understand this depends on my scenario, but that wasn't the point of the question. The question is: is it right that thread-safe code can be written with a (static int) variable without using the lock keyword?
It is my problem to write correct code. The answer seems to be: yes, if you write correct, simple code that is not too complicated, you can create thread-safe functions without needing the lock keyword.
If one thread simply sets the value and another thread reads the value, then a lock is not necessary; the read and write are atomic. But if multiple threads might be updating it and are also reading it to do the update (e.g., increment), then you definitely do need some kind of synchronization. If only one thread is ever going to update it even for an increment, then I would argue that no synchronization is necessary.
Edit (three years later) It might also be desirable to add the volatile keyword to the declaration to ensure that reads of the value always get the latest value (assuming that matters in the application).
The concept of thread 'safety' is too vague to be meaningful unfortunately. If you're asking whether you can read and write to it from multiple threads without the program crashing during the operation, the answer is almost certainly yes. If you're also asking if the variable is guaranteed to either be the old value or the new value without ever storing any broken intermediate values, the answer for this data type is again almost certainly yes.
But if your question is "will my program work correctly if I access this from multiple threads", then the answer depends entirely on what your program is doing. For example, if you run the following pseudo code in 2 threads repeatedly in most programming languages, eventually you'll hit the assertion.
if MyVar >= 1:
MyVar = MyVar - 1
assert MyVar >= 0
Primitives like int are thread-safe in the sense that reads/writes are atomic. But as with most any type, it's left to you to do proper checking with more complex operations. For example, if (x > 0) x--; would be problematic in a multi-threaded scenario because x might change in between the if condition check and decrement.
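To show how the check and the decrement can be fused into a single atomic step, here is a sketch of the if (x > 0) x--; case. It is written in Java as an analogue (in C# the same loop would be built on Interlocked.CompareExchange), and GuardedDecrement / decrementIfPositive are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of any library.
final class GuardedDecrement {
    // Decrements only while the value is positive. The compare-and-set succeeds
    // only if nobody changed the value since it was read, so the check and the
    // write cannot be separated by another thread's update.
    static boolean decrementIfPositive(AtomicInteger value) {
        while (true) {
            int current = value.get();
            if (current <= 0) {
                return false;               // nothing left to take
            }
            if (value.compareAndSet(current, current - 1)) {
                return true;                // our decrement was applied
            }
            // Another thread changed the value in between; re-read and retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger stock = new AtomicInteger(2);
        System.out.println(decrementIfPositive(stock));
        System.out.println(decrementIfPositive(stock));
        System.out.println(decrementIfPositive(stock));
    }
}
```

With this pattern the value can never drop below zero, which is exactly the invariant the plain if-then-decrement version fails to protect.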
A simple read or write on a field of 32 bits or less is always atomic. But you should post your read/write code so that we can check whether it is thread safe.
Check out this post: http://msdn.microsoft.com/en-us/magazine/cc163929.aspx
It explains why you need to synchronize access to the integers in this scenario
Try Interlocked.Increment() or Interlocked.Add() and you'll be right. Your code complexity will be the same but you truly won't have to worry. If you're not worried about losing a few clicks in your counter, you can continue as you are.
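For comparison, here is what an atomic click counter looks like as a sketch. It is in Java, where AtomicInteger.incrementAndGet plays roughly the role that Interlocked.Increment plays in .NET; ClickCounter and countWithThreads are hypothetical names used only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of any library.
final class ClickCounter {
    private final AtomicInteger clicks = new AtomicInteger();

    // Atomic read-modify-write: no increment is lost even under contention.
    int recordClick() {
        return clicks.incrementAndGet();
    }

    int total() {
        return clicks.get();
    }

    // Hammers one counter from several threads and returns the final total.
    static int countWithThreads(int threads, int clicksPerThread) {
        ClickCounter counter = new ClickCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < clicksPerThread; j++) {
                    counter.recordClick();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(8, 10_000));
    }
}
```

If the field were a plain int incremented with ++, the same run could come up short, because two threads may read the same old value before either writes back.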
Reading or writing integers is atomic. However, reading and then writing is not atomic. So, if you have one thread that writes and many that read, you may be able to get away without locks.
However, even though the operations are atomic, there are still potential multi-threading issues. In order for one thread to be guaranteed that another thread can see values it writes, you need a memory barrier. Otherwise, the compiler can optimize the code so that the variable stays in a register (or even optimize the operation away completely), so changes would be invisible from one thread to another.
You can establish a memory barrier explicitly (volatile or Thread.MemoryBarrier), or with the Interlocked class -- or with the lock statement (Monitor).
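The visibility problem can be illustrated with a small sketch. This one is in Java, whose volatile keyword is similar in spirit to C#'s but not identical in its exact memory-model guarantees, so treat it as an analogue only; StopFlag and workerStopsWithin are hypothetical names.

```java
// Hypothetical demo class, not part of any library.
final class StopFlag {
    // volatile makes a write by one thread visible to later reads in other threads.
    // Without it, the loop below may never observe the change, because the
    // runtime is allowed to keep re-using a previously read value.
    private volatile boolean stopRequested;

    // Starts a spinning worker, requests a stop, and reports whether it exited in time.
    static boolean workerStopsWithin(long timeoutMillis) {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(() -> {
            while (!flag.stopRequested) {
                Thread.yield();             // spin until the flag is seen
            }
        });
        worker.setDaemon(true);             // do not keep the JVM alive if it never stops
        worker.start();
        flag.stopRequested = true;          // volatile write, published to the worker
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerStopsWithin(5_000));
    }
}
```

Note that volatile only addresses visibility; it does not make a read-then-write sequence such as an increment atomic.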
Running into a prickly problem with our web app here. (Asp.net 2.0 Win server 2008)
Our memory usage for the website, grows and grows even though I would expect it to remain at a fairly static level. (We have a small amount of data that gets stored in state).
Wanting to find out what the problem is, I've run a System.GC.Collect(); a few times, taken a memory dump and then loaded this memory dump into WinDbg.
When I do a DumpHeap -Stat I see an inordinately large number of instances of one particular type hanging around in memory.
0000064280580b40 713471 79908752 PaymentOption
so, doing a DumpHeap -MT for this type, I get a stack of object references. Picking a random number of these, I do a !gcroot and the command comes back reporting that no references are held to it.
To me, this is exactly when the GC should collect these items, but for some reason they have been left outstanding.
Can anybody offer an explanation as to what might be happening?
You could try using sosex.dll in Windbg, which is an extension written to help with .NET debugging. There is a command named !refs which is similar to !gcroot, in that it will show you all the objects referencing an object, plus it will show all the objects that it too is referencing.
In the example on the author's website, !refs is used against an object and the output looks like this:
0:000> !refs 0000000080000db8
Objects referenced by 0000000080000db8 (System.Threading.Mutex):
0000000080000ef0 32 Microsoft.Win32.SafeHandles.SafeWaitHandle
Objects referencing 0000000080000db8 (System.Threading.Mutex):
0000000080000e08 72 System.Threading.Mutex+<>c__DisplayClass3
0000000080000e50 64 System.Runtime.CompilerServices.RuntimeHelpers+CleanupCode
Few things:
GC.Collect won't help you do any debugging. The garbage collector is already being called: if any objects were available for collection it would have happened already.
Idle memory on a server is wasted memory. Are you sure memory is being 'leaked', or is it just that the framework is deciding it can keep more things in memory or keep more memory around for faster access? In this case I suspect you are leaking memory, but it's something to double check.
It sounds like something you don't expect is keeping a reference to PaymentOption objects. Perhaps a static collection somewhere? Or separate thread?
Does PaymentOption implement a finalizer by any chance? Does it call an STA COM object?
I'd be curious to see the output of !finalizequeue, to see whether the count of objects showing up on the heap roughly matches the number waiting to be finalized. Output should probably look something like this:
generation 0 has 57 finalizable objects (0409b5cc->0409b6b0)
generation 1 has 55 finalizable objects (0409b4f0->0409b5cc)
generation 2 has 0 finalizable objects (0409b4f0->0409b4f0)
Ready for finalization 0 objects (0409b6b0->0409b6b0)
If the number of Ready for finalization objects continues to grow, and you're certain garbage collections are occurring (confirm via perfmon counters), then it might be a blocked finalizer thread. You might need to take several snapshots over the lifetime of the process (before a recycle) to confirm. I usually rely on the magic number of three, as long as the site is under some sort of load.
A bug in a finalizer can block the finalizer thread and prevent the objects from ever being collected.
If the PaymentOption object calls a legacy STA COM object, then this article ASP.NET Hang and OutOfMemory exceptions caused by STA components might point in the right direction.
Not without more info on your application. But we ran into some nasty memory problems a long time ago. Do you use ASP.NET caching? As Raymond Chen likes to say, "poor caching strategy is indistinguishable from a memory leak."
Check out another tool - CLRProfiler.exe - it will help you traverse object reference trees to see where your objects are rooted.
You've heard this before - if you have to GC.Collect, something is wrong.
Is the PaymentOption object created in an asynchronous process, by any chance? I remember something about, if you don't call EndInvoke, you can get problems like this.
I've been investigating the same issue myself and was asking why objects that had no references were not being collected.
Objects larger than 85,000 bytes are stored on the Large Object Heap, from which memory is freed up less frequently.
http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
A single PaymentOption may not be that big, but are they contained within collections, or are they based on something like a DataSet? You should pick a few instances of the PaymentOption / collection / DataSet and then use the sos !objsize command to see how big they are.
Unfortunately this doesn't really answer the question. I like to think I can trust the .net framework to take care of releasing unused memory whenever it needs to. However I see a lot of memory being used by the worker process running the app I am looking at, even when memory looks quite tight on the server.
FYI, SOS in .NET 4 supports a few new commands that might be of assistance, namely !gcwhere (locates the generation of an object; sosex's !gcgen) and !findroots (does what it says on the tin; sosex's !refs).
Both are documented on the SOS documentation and mentioned on Tess Ferrandez's blog.