Why some people say Riak is eventual consistent [closed] - riak

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
In Riak, by default, the data in a bucket is replicated to 3 different nodes (N=3). The number of replicas that must respond to a read or write request before it is considered successful is 2 (R=2, W=2).
We know that when N is smaller than R+W, Riak provides strong consistency. So with these default values Riak should provide strong consistency (like MySQL). I can't understand why some people say Riak is not strongly consistent and only provides eventual consistency.

R and W values (together with PW (primary write), PR (primary read) and DW (durable write)) allow you to tune consistency according to your application's requirements. Even though this can guarantee consistency during normal operation, Riak is still eventually consistent because of how it handles failure scenarios such as network partitions.
If we assume we have a cluster of 5 nodes with N set to 3, and 2 nodes are partitioned from the rest for some period of time, all nodes will still be able to accept both reads and writes according to the previously mentioned parameters. If we further assume PR and PW are set to 0, records can be updated on both sides of the partition while it is in place. This can cause inconsistencies that cannot be resolved until cluster connectivity is restored, which makes the system eventually consistent.
PR and PW allow the user to specify that a certain number of primary partition owners must be present in order for the read/write to succeed. If these are set to a value > 0, a partitioned cluster may not allow reads/writes to all objects on all nodes for as long as the network partition persists.
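The partition scenario above can be sketched as a toy model. This is not Riak's API: `write_allowed`, the node numbers, and the choice of primary owners are made up for illustration, and the model simply assumes that any reachable node (including a fallback) can count toward W while only primary owners count toward PW.

```python
def write_allowed(reachable: set, primaries: set, w: int, pw: int) -> bool:
    """Toy model of whether one side of a partition accepts a write.

    Assumption: fallback nodes may stand in for unreachable primaries,
    so W only needs enough reachable nodes, while PW needs enough
    reachable *primary* owners of the key.
    """
    enough_nodes = len(reachable) >= w
    enough_primaries = len(reachable & primaries) >= pw
    return enough_nodes and enough_primaries


# Hypothetical 5-node cluster split 3/2; the key's N=3 primaries are nodes 1, 2, 3.
side_a = {0, 1, 2}
side_b = {3, 4}
primaries = {1, 2, 3}

# PW=0: both sides accept writes, so the same key can diverge.
both_sides_writable = (
    write_allowed(side_a, primaries, w=2, pw=0)
    and write_allowed(side_b, primaries, w=2, pw=0)
)

# PW=2: only the side that still sees two primaries accepts the write.
minority_blocked = not write_allowed(side_b, primaries, w=2, pw=2)
```

With PW=0 both sides stay available and diverge, which is the eventual-consistency case. With PW=2 the minority side rejects the write, trading availability for consistency.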

A better term for what Riak and similar systems offer might be tunable consistency, since they let you make decisions about the tradeoffs between consistency, availability, and partition tolerance. MySQL and other RDBMS systems, by contrast, go to great lengths to guarantee consistency. This is an expensive guarantee that isn't always worth the cost.


BDD Story style [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
We are using Behaviour Driven Development to develop a SOA system using Scrum and have come across two approaches to producing the stories.
Approach 1
Given Specific Message Type is available
And Specific State exists
When the Message is processed
Then expected resulting state exists
Approach 2
Given a Specific state exists
When Specific Message Type is processed
Then expected resulting state exists
Few if any of the examples available are applied to testing SOA systems. I would appreciate any experiences of these or any insights on the consequences of each approach.
We are aiming for declarative rather than imperative stories. The message arrival in the first has a slightly imperative feel but I'm not confident the second approach covers acceptance criteria adequately, because it doesn't seem to account for the event driven nature of the SUT.
The aim of the story is to communicate with your customer, so whatever style promotes that goal is best - and that will vary from one team to another. I might prefer 'when some business event occurs' rather than your suggestions, but I don't know your team! Beware of trying to find a 'one-size-fits-all' template, use whatever communicates best for each situation. And the heart of agile is the ability to adapt - try one style and feel free to adapt if it doesn't seem to be working.
Whenever I'm producing a library or service, I often find it useful to provide an example of the kind of scenario which a service user might want. So for instance:
Given the server has information about risk limits for Lieumoney Brothers
And we are $2m from those limits
When we process EOD orders that take us to $1m for those limits
Then our status with Lieumoney Brothers should be Amber.
The actual contents of the message can then reflect the interaction with those particular sums and that particular counterparty. You can use this for lots of different domains, and your approach will depend on the domain and whether the availability of a message is unusual, for that domain. In the above example where you're trading millions then having risk information for a particular counterparty might be valuable, and if that's the state, it's worth calling out separately. It's probably less important if you're buying baby rabbits, for example.
Given "Rotweiller Pets Limited" is trading baby rabbits $2 cheaper than anyone else
When we ask the system to order 15 baby rabbits
Then it should place an order with "Rotweiller Pets Limited".
It's hard to discuss this without specific examples. However, you can probably see how providing those scenarios would then act as documentation for how to use your APIs, even if the underlying automation for those scenarios talks directly to the API, and has nothing actually specific for pets or Lieumoney trades.

Has anyone tried to break a bit into something even smaller? [closed]

Closed 10 years ago.
I was reading a book to learn more about ASM, and the author happened to comment on bits. The exact quote is:
A bit is the indivisible atom of information. There is no half-a-bit, and no bit-and-a-half. (This has been tried. It works badly. But that didn't stop it from being tried.)
My question is: when has this been tried? What was the outcome? How did it go badly? It's bothering me that Google isn't helping me find any case where someone tried to make half a bit and use(?) it.
Thanks if you can find out when this happened.
Yes. That's what arithmetic coding (a type of compression) is about. It allows information to be stored in fractional bits.
I believe that in the specific example you're talking about, that the author was merely being tongue in cheek, and not referring to any actual attempt to split bits.
A bit, as defined by present day computers, is a binary value 0 or 1. That is the 'atom' of information, because in binary logic you cannot represent anything other than that using a single 'bit' - to represent anything else, like 0.5, you need more 'bits'.
However, in multilevel electronics a 'bit' would have multiple values. If someone makes a computer whose electronics let each 'bit' take a value between 0 and 9, then you have a bit that can store more than just 0/1. Perhaps the author meant this. Attempts to make computers with multilevel bits have failed 'miserably': electronics has not figured out how to do that in a reliable, cost-effective fashion. If someone could, then a single cell taking on a value from 0 to 1023 would hold as much as 10 binary cells (since 2^10 = 1024), and a memory chip could be about 10 times smaller than current chips (just theoretically, if everything else remained constant).
Though admittedly at a physical level, a bit would still remain as a bit. That is 1 wire going into a chip. That is 1 gate input. That is 1 memory cell. If you divide that 1 wire, 1 input, or that one cell into two, you get two wires/inputs/cells, NOT half wire/input/cell. So you get two bits.
I believe the author tries to state a metaphysical fact with humour.
Data is commonly stored using multilevel voltages in magnetic discs and flash memory. However one can calculate the "optimal" base of a number system being 'e=exp(1)=~2.718...', which AFAIK hasn't been "tried", while ternary (base-3) system is quite common in fast parallel arithmetic algorithms and it works better than base-2 in many applications.
Also, as omnifarious states, arithmetic/range encoding can be seen as a method of using fractional bits: e.g. if there are only three possible messages (e.g. 001, 010, 100), those can be stored in two bits "leaving one quarter of the space" unused.
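To make the "fractional bits" idea concrete, here is a minimal sketch. It is not a full arithmetic coder, only the fixed-radix special case: three-valued symbols are packed together so that each one costs about log2(3) ≈ 1.585 bits instead of 2. The function names are made up for illustration.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Information content of one symbol drawn uniformly from the alphabet."""
    return math.log2(alphabet_size)


def pack_trits(trits):
    """Pack base-3 digits into one integer, first digit most significant."""
    value = 0
    for trit in trits:
        if trit not in (0, 1, 2):
            raise ValueError("each symbol must be 0, 1 or 2")
        value = value * 3 + trit
    return value


def unpack_trits(value, count):
    """Inverse of pack_trits for a known number of digits."""
    trits = []
    for _ in range(count):
        value, trit = divmod(value, 3)
        trits.append(trit)
    return trits[::-1]


# Five three-valued symbols fit in one byte because 3**5 = 243 <= 256,
# i.e. 8 / 5 = 1.6 bits per symbol rather than 2.
symbols = [2, 0, 1, 1, 2]
packed = pack_trits(symbols)
restored = unpack_trits(packed, len(symbols))
```

No individual symbol occupies a whole number of bits here. The saving only shows up when several symbols share one container, which is the same reason arithmetic coding works on whole messages rather than single symbols.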

VB.NET Application Blocks [closed]

Closed 10 years ago.
What are the limitations of using ApplicationBlocks (An Introduction and Overview of the Microsoft Application Blocks) for ASP.NET/VB.NET applications? I have found lots of websites that talk about the benefits e.g. divorcing the data tier from the web tier, but I cannot find a web page that discusses the limitations.
I don't think you can really get a plain list of disadvantages. Microsoft Enterprise Library is a good library, well documented, rich and with tons of features.
You should change your question to "When do I not need to use it?". Of course, this question should be asked again for each block. I'll try to summarize a little.
For every block, you should consider not using the library when you do not need its complexity. Features don't come without a cost, and the most obvious one is complexity (first of all in deployment and configuration). If you have to document it and your users have to change the application's configuration, you may need to provide a tool or a lot of documentation. Complexity can be hidden in code too: even though the EL designers tried to make everything easy, it can't be as easy as a raw solution.
The second important disadvantage is, obviously, speed. Layers of abstraction aren't free, and you'll pay a speed cost for them. In some cases you may not care (simple logging, for example), but it may be a problem in other cases (so again the answer is "it depends"). Think, for example, of the Unity Application Block: you'll get all the power of injection, but you'll pay a significant cost for it.
So when should you use it? In my opinion a big advantage of this library is that you do not need to use all of it together. You can pick the blocks you need when you need them. It's very common, for example, to use Logging and Exception Handling, but you may never need Unity in your whole life. The Data Access Application Block is a very thin layer above ADO.NET. It simplifies many common tasks, but you do not gain the level of abstraction you have, for example, with LINQ to SQL or Entities (hey, do not forget they have very different purposes), and you should consider using it only if nothing else suits what you need.
So, finally, in my opinion you should consider each block and use it if and only if you really need all the complexity that comes with an enterprise-level library. It's not designed for small single-user applications (even if you may sometimes find that some Application Blocks are great for a specific task). Its drawbacks aren't small: complexity and speed can't be ignored (both for short-term solutions and long-term maintenance plans). Moreover, if you really need all its power, you'll find it's not as easy as a "ready-to-go" solution: to have that control, you'll need to write more code.

What is the address space of a process? [closed]

Closed 11 years ago.
Can anybody please tell me what is meant by address space?
Why is it called that?
And what about virtual memory?
Thanks in advance
Regards
Pavankumar
I think address space refers to a segment.
In real mode (Intel's XT and 286), a segment is just a way to make a program independent of its location in memory. When a program gets compiled, the addresses (of variables and of labels, i.e. functions) are hardcoded into the program. This makes it difficult to load two programs at the same time, because they would all want to use the same addresses.
We need to use relative addresses instead of absolute ones. The translation between relative and physical addresses is made relative to segments. If one program is loaded starting from segment 0x200 and another from segment 0x600, they can freely use the same address (for example 0x41), because it will be relative to their respective segments. In our case (real mode), segment 0x200 is translated to physical address 0x2000 (by multiplying it by 0x10), and after adding the relative address, the resulting physical address is 0x2041.
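The real-mode arithmetic above is small enough to write out. This is just the calculation from the paragraph, with a made-up function name:

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Real-mode translation: the segment value is multiplied by 0x10
    (shifted left by 4 bits) and the relative address is added."""
    return segment * 0x10 + offset


# Two programs use the same relative address 0x41 in different segments.
first_program = real_mode_physical(0x200, 0x41)
second_program = real_mode_physical(0x600, 0x41)
print(hex(first_program), hex(second_program))  # 0x2041 0x6041
```

Both programs use the same offset, yet they end up at different physical addresses, which is what lets them be loaded side by side.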
There are several segments that can be used. Data operations are, by default, made relative to the program's Data Segment (held in the DS register of the CPU), and code operations are made relative to the Code Segment (held in the CS register). Stack addresses are resolved to physical addresses using the Stack Segment (SS register).
But in real mode you can freely use the segments, you can access other program's segments or enter arbitrary values which will be resolved to arbitrary physical addresses.
In protected mode the whole concept changed. Segments do not hold addresses any more. They hold selectors. They only refer to an element in a table, where the real base addresses are held. The table also contains limits, so you can no longer address ANY physical address, only inside the portion of memory which was given to your program by the OS. This introduces the concept of ownership of memory blocks by processes.
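As a rough sketch of the protected-mode idea, the lookup can be modelled as a table of base/limit pairs. This is heavily simplified: real selectors and descriptors also carry privilege and access bits, and the names `Descriptor`, `translate` and `SegmentLimitError` are invented for illustration.

```python
from dataclasses import dataclass


class SegmentLimitError(Exception):
    """Raised when an offset falls outside the segment's limit."""


@dataclass(frozen=True)
class Descriptor:
    base: int   # physical address where the segment starts
    limit: int  # highest valid offset inside the segment


def translate(table: list, selector: int, offset: int) -> int:
    """Look up the descriptor named by the selector and check the limit
    before producing a physical address."""
    descriptor = table[selector]
    if offset < 0 or offset > descriptor.limit:
        raise SegmentLimitError(
            f"offset {offset:#x} exceeds limit {descriptor.limit:#x}"
        )
    return descriptor.base + offset


# Hypothetical table set up by the OS: one 4 KiB segment for our program.
table = [Descriptor(base=0x10000, limit=0x0FFF)]
inside = translate(table, 0, 0x41)  # allowed: within the segment
```

An offset past the limit raises an error instead of silently reaching memory that belongs to another program. That is the ownership idea described above.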
I think this is enough for a start. Feel free to read more on Wikipedia or other good sources; the topic is well documented.

What are some ways to optimize your use of ASP.NET caching? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
I have been doing some reading on this subject, but I'm curious to see what the best ways are to optimize your use of the ASP.NET cache and what some of the tips are for determining what should and should not go in the cache. Also, are there any rules of thumb for determining how long something should stay in the cache?
Some rules of thumb
Think in terms of cache miss to request ratio each time you contemplate using the cache. If cache requests for the item will miss most of the time then the benefits may not outweigh the cost of maintaining that cache item
Contemplate the query expense vs cache retrieval expense (e.g. for simple reads, SQL Server is often faster than distributed cache due to serialization costs)
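The two rules of thumb above can be turned into a back-of-the-envelope calculation. This is a simplified model written in Python rather than ASP.NET: it ignores the cost of storing and invalidating items, and the millisecond figures are made up.

```python
def expected_cost_with_cache(miss_ratio: float, cache_lookup_ms: float,
                             query_ms: float) -> float:
    """Average cost per request when every request checks the cache first.

    Every request pays the cache lookup; only misses also pay the query.
    """
    return cache_lookup_ms + miss_ratio * query_ms


def cache_is_worth_it(miss_ratio: float, cache_lookup_ms: float,
                      query_ms: float) -> bool:
    """True when the cached path is cheaper on average than always querying."""
    return expected_cost_with_cache(miss_ratio, cache_lookup_ms, query_ms) < query_ms


# A cheap query behind a distributed cache that mostly misses: not worth it.
mostly_missing = cache_is_worth_it(miss_ratio=0.9, cache_lookup_ms=2.0, query_ms=5.0)

# The same query with a mostly-hitting cache: worth it.
mostly_hitting = cache_is_worth_it(miss_ratio=0.1, cache_lookup_ms=2.0, query_ms=5.0)
```

In this model the cache pays off only when the lookup cost is smaller than the hit ratio times the query cost, so both the miss ratio and the relative expense of the query matter.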
Some tricks
gzip strings before sticking them in cache. Effectively expands the cache and reduces network traffic in a distributed cache situation
If you're worried about how long to cache aggregates (e.g. counts) consider having non-expiring (or long-lived) cached aggregates and pro-actively updating those when changing the underlying data. This is a controversial technique and you should really consider your request/invalidation ratio before proceeding but in some cases the benefits can be worth it (e.g. SO rep for each user might be a good candidate depending on implementation details, number of unanswered SO questions would probably be a poor candidate)
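The gzip trick might look something like the following sketch. It is written in Python rather than ASP.NET, and the plain dictionary is only a stand-in for whatever cache you actually use; the helper names are hypothetical.

```python
import gzip

# Stand-in for a real cache (in-process or distributed).
cache = {}


def cache_put_text(key: str, text: str) -> None:
    """Compress the string before storing it, so it takes less cache space."""
    cache[key] = gzip.compress(text.encode("utf-8"))


def cache_get_text(key: str):
    """Return the original string, or None on a cache miss."""
    blob = cache.get(key)
    if blob is None:
        return None
    return gzip.decompress(blob).decode("utf-8")


# Repetitive markup compresses very well.
fragment = "<li class='row'>item</li>" * 1000
cache_put_text("fragment", fragment)
stored_bytes = len(cache["fragment"])
original_bytes = len(fragment.encode("utf-8"))
```

The trade-off is CPU time on every read and write, so this tends to pay off for large, repetitive strings and less so for small values.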
Don't implement caching yet.
Put it off until you've exhausted all the Indexing, query tuning, page simplification, and other more pedestrian means of boosting performance. If you flip caching on before it's the last resort, you're going to have a much harder time figuring out where the performance bottlenecks really live.
And, of course, if you have the backend tuned right when you finally do turn on caching, it will work a lot better for a lot longer than it would if you did it today.
The best quote I've heard about performance tuning and caching is that it's an art, not a science. Sorry, I can't remember who said it, but the point is that so many factors can affect the performance of your app that you need to evaluate each situation case by case and make considered tweaks until you reach the desired outcome.
I realise I'm not giving any specifics here, but I don't really think you can.
I will give one previous example, though. I worked on an app that made a lot of calls to web services to build up a client profile, e.g.
GET client
GET client quotes
GET client quote
Each object returned by the web service contributed to a higher-level object that was then used to build the resulting page. At first we gathered up all the objects into the master object and cached that. However, when things were not as quick as we would have liked, we realised that it would make more sense to cache each returned object individually, so that it could be re-used on the next page the client sees, e.g.
[Cache] client
[Cache] client quotes
[Cache] client quote
GET client quote upgrades
Unfortunately there are no pre-established rules, but as a matter of common sense, I would say that you can easily cache:
Application Parameters (list of countries, phone codes, etc...)
Any other application non-volatile data (list of roles even if configurable)
Business data that is often read and does not change much (or not a big deal if it is not 100% accurate)
What you should not cache:
Volatile data that changes frequently (usually the business data)
As for the cache duration, I tend to use different durations depending on the type of data and its size. Application Parameters can be cached for several hours or even days.
For some business data, you may want to have smaller cache duration (minutes to 1h)
One last thing is always to challenge the amount of data you manipulate. Remember that the end-user won't read thousands of records at the same time.
Hope this will give you some guidance.
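The duration advice above (long-lived application parameters, short-lived business data) can be sketched with a tiny expiring cache. This is Python rather than ASP.NET, `TTLCache` is a made-up class, and the two durations are only examples picked from the ranges mentioned.

```python
import time


class TTLCache:
    """Minimal cache where each item carries its own expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._items = {}

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return default
        return value


# Example durations (assumptions, tune them for your own data).
APP_PARAMETER_TTL = 24 * 60 * 60  # one day for lists of countries, roles, ...
BUSINESS_DATA_TTL = 5 * 60        # five minutes for frequently read business data

app_cache = TTLCache()
app_cache.set("countries", ["FR", "DE", "US"], APP_PARAMETER_TTL)
app_cache.set("client:42:quotes", [101, 102], BUSINESS_DATA_TTL)
```

Choosing the duration per item, rather than one global setting, is what lets rarely changing parameters live for hours or days while business data is refreshed within minutes.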
It's very hard to generalize this sort of thing. The only hard-and-fast rule to follow is not to waste time optimizing something unless you know it needs to be done. Then the proper course of action is going to be very much dependent on the nitty gritty details of your application.
That said... I'll almost always cache global application parameters in some easy-to-use object. This is certainly more of a programming convenience than an optimization.
The one time I've written specific data caching code was for an app that interfaced with a very slow accounting database, and then it was read-only for data that didn't change very often. All writes went to the DB. With SQL Server, I've never run into a situation where the built-in ASP.NET-to-SQL Server interface was the slow part of the equation.
