Validation Time Analysis [closed]

The message validation server can validate two types of messages: type A and type B. It is commonly believed that validating a type B message takes somewhat longer than validating a type A message. It is also commonly believed that the actual validation time depends essentially only on the message type, not on the message content. However, nobody has checked these claims in practice.
We have reliable statistics for four different sessions:
1st session - 17,648 type A messages and 11,414 type B messages were validated; the session lasted 6 minutes and 7.90 seconds.
2nd session - 6,836 type A messages and 12,618 type B messages were validated; the session lasted 4 minutes and 23.80 seconds.
3rd session - 24,616 type A messages and 17,648 type B messages were validated; the session lasted 8 minutes and 56.10 seconds.
4th session - 10,684 type A messages and 12,684 type B messages were validated; the session lasted 5 minutes and 7.78 seconds.
Tasks:
1. Determine to what precision the validation time depends only on the message type (A or B), not on the message content.
2. Determine the average validation times for both type A and type B messages.
For this problem, I first denoted the time required for a type A message by x and for a type B message by y, and then wrote a linear equation for each session.
Using the linear equations, I was able to determine that y < 8.98x, which would indicate that y/x is somewhere between 1 and 8.98.
I also tried this approach:
time spent on type B ≈ (number of type B messages / total messages) × session time.
I used this to get a rough estimate of the average time required for type A and type B messages. I think finding the times this way and taking the average answers the second question.
The first question has me stumped.
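With four sessions and only two unknowns, the system n_A·x + n_B·y = T is overdetermined, so a least-squares fit addresses both tasks at once: the fitted x and y are the average validation times, and the per-session residuals show how precisely the "time depends only on the type" model holds. A minimal sketch with NumPy (durations converted to seconds; variable names are my own):

import numpy as np

# Message counts per session: columns are (type A count, type B count).
counts = np.array([
    [17648, 11414],
    [ 6836, 12618],
    [24616, 17648],
    [10684, 12684],
], dtype=float)

# Session durations in seconds (minutes * 60 + seconds).
durations = np.array([367.90, 263.80, 536.10, 307.78])

# Least-squares fit of counts @ [x, y] ~= durations, where x and y are
# the per-message validation times for types A and B.
(x, y), _, _, _ = np.linalg.lstsq(counts, durations, rcond=None)

errors = durations - counts @ np.array([x, y])

print(f"x (type A) ~= {x * 1000:.3f} ms per message")
print(f"y (type B) ~= {y * 1000:.3f} ms per message")
# The size of these per-session errors bounds how much the validation
# time can depend on anything other than the message type (task 1).
print("per-session error (s):", np.round(errors, 2))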

Related

How does Raft handle a prolonged network partition?

Consider that we are running Raft on 3 machines: A, B, C, and let A be the leader. There is a network partition that splits C from A and B. Call the current term 2. A and B remain on term 2, with no additional messages besides periodic heartbeats. Meanwhile, C enters the candidate state, increments its term to 3, votes for itself, times out, and repeats. After, say, 10 cycles, the network partition is resolved. Now the state is A[2], B[2], C[12]; C will reject the AppendEntries RPC from A because term 2 is less than its current term, 12; C cannot assemble a quorum and will continue to run the leader election protocol as a candidate, becoming increasingly divergent from the current term of A and B.
The question is then, how does Raft (or Raft-derived implementations) handle this issue? Some thoughts I had included:
Such a situation is an availability issue, rather than a safety violation. Ignore it and let human operators handle it by killing or resetting C.
Use exponential backoff to decrease the divergence of C across elections.
Have C use lastApplied instead of currentTerm as the basis for rejecting or accepting the AppendEntries RPC. That is, we trust the log as the source of truth for terms, rather than the currentTerm value. This is already used to ensure that C would not win an election, per the Election Restriction; however, the paper seems to indicate that this "up-to-date" property is grounds for not voting for C, but not grounds for C to acquiesce and reset to a follower.
Note: terminology as per In Search of an Understandable Consensus Algorithm (Extended Version)
When C rejects an AppendEntries RPC from the leader A, it will return its now > 2 term. Raft replicas always recognize greater terms, so that in turn will cause the leader to step down and start a new election. Eventually, the cluster will converge on a new term that’s > 2 and which is >= C’s term.
This is an oft-discussed (in the Raft dev community) and somewhat inconvenient scenario that can cause unnecessary churn in Raft clusters. To guard against it, the Raft dissertation and most real-world implementations introduce and use the so-called "pre-vote protocol". The pre-vote protocol essentially dictates that before becoming a candidate, a follower must first determine whether it can win an election by asking its peers. In the scenario you described above, C would ask for a pre-vote from A and B, and because of the network partition it would not receive any votes. So C would never transition to the candidate role, never increment the term, and thus never present a term > 2 after the partition heals. Thus, you've eliminated the churn.
You can read more about the pre-vote protocol in Diego’s dissertation.
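For concreteness, here is a toy sketch of the pre-vote idea in Python; the types and field names are invented for illustration and are not taken from any particular Raft implementation:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Peer:
    heard_from_leader_recently: bool
    last_log_term: int
    last_log_index: int

@dataclass
class Node:
    current_term: int
    last_log_term: int
    last_log_index: int
    cluster_size: int
    reachable_peers: List[Peer] = field(default_factory=list)

def pre_vote_granted(peer: Peer, node: Node) -> bool:
    # A peer refuses a pre-vote while it still hears from a live leader,
    # and otherwise applies the usual "log at least as up-to-date" check.
    if peer.heard_from_leader_recently:
        return False
    if node.last_log_term != peer.last_log_term:
        return node.last_log_term > peer.last_log_term
    return node.last_log_index >= peer.last_log_index

def should_start_election(node: Node) -> bool:
    # Only increment currentTerm and campaign if a quorum says we could win.
    votes = 1 + sum(pre_vote_granted(p, node) for p in node.reachable_peers)
    return votes > node.cluster_size // 2

# In the partitioned scenario above: C reaches no peers, collects only its
# own vote, and therefore never bumps its term while partitioned.
c = Node(current_term=2, last_log_term=2, last_log_index=10,
         cluster_size=3, reachable_peers=[])
print(should_start_election(c))  # False -> C stays a follower at term 2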

Call Detail Records (CDR) specifications [closed]

My question is about CDRs (Call Detail Records). I want to know more about them. I searched a lot (on Google and other sites), but unfortunately there are few references, and I couldn't find answers to my questions in any of them. (Please share any reference you know and think will be useful.)
I want to know...
1. Where is the CDR element in network structures? I mean, for example, in LTE, which elements is it connected to (S-GW, MME, HSS, PCRF, etc.)? (From what I read, CDR handling is "mediation", but where is it in practical networks? Where should it be?)
2. As I searched, I couldn't find any big company (vendor) making specific hardware for CDRs. Is there any specific hardware that most mobile network operators use?
3. Is there any standard specification (not necessarily official, but used by most) for CDRs (interfaces, protocols, file formats, etc.)?
Thanks a lot
CDR is an "old" term that comes from the old fixed networks, where the only service was voice and Call Data Records were generated by the switch. By extension, today, CDR means any usage information generated by a network element. It can still be voice, or mobile data, or wifi, or SMS, etc. Sometimes they are also called UDRs, "U" for Usage Data Record.
The MSC generates CDRs about incoming calls, outgoing calls, transit calls, and SMS traffic. Basically, a CDR says that number A has called number B for S seconds, that the location of A is a given Cell ID and LAC, that the call used a certain trunk, and so on. There is no information about the price, for example. The same goes for the "CDRs" from the SGSN, GGSN, or MME, where the information usually provided is location, type of (data) protocol used (TCP, UDP, ARP, HTTP, SMTP, ...), volume, etc. The SMSC, USSD gateway, and others also produce this kind of CDR. I tend to call those "traffic CDRs", as they describe the traffic information.
They are complementary to the "charging CDRs", where the price information is produced. For example, for a voice call, the IN platform (sometimes called the OCS, Online Charging System) will generate CDRs with the A number, the B number, the call duration (which usually differs from the duration seen on the MSC), the accounts that were used to pay for the call, etc. The same holds for data, SMS, and the charging of all other services. Those CDRs may also be used for offline billing.
I'm not aware of any standard. There may be specifications about what CDRs a given (standard) platform needs to produce, but my (quite long) experience in the field says you should rely not on those but on the spec defined by the equipment vendor and your own test procedure.
This is where mediation comes into the game. It's an IT system that is able to (a toy sketch of such a pipeline follows the list):
get (or receive) unprocessed CDR files from all the network equipment
identify and filter out some unnecessary fields
sometimes aggregate several traffic CDRs into one CDR
sometimes deduplicate CDRs, or make sure that there is only one CDR per network event
finally, produce output files that will be used by other systems like billing or a data warehouse
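A toy sketch of those steps in Python, assuming the raw CDRs have already been parsed into dicts (field names such as event_id and msisdn are invented for illustration, not from any vendor format):

KEEP_FIELDS = {"event_id", "msisdn", "start_time", "duration_s", "cell_id"}

def mediate(raw_cdrs):
    seen_events = set()
    out = []
    for cdr in raw_cdrs:
        # Deduplicate: keep at most one CDR per network event.
        if cdr["event_id"] in seen_events:
            continue
        seen_events.add(cdr["event_id"])
        # Filter out fields that downstream systems don't need.
        out.append({k: v for k, v in cdr.items() if k in KEEP_FIELDS})
    return out

raw = [
    {"event_id": 1, "msisdn": "336000001", "start_time": "12:00:01",
     "duration_s": 42, "cell_id": "A1", "vendor_internal_flag": 7},
    {"event_id": 1, "msisdn": "336000001", "start_time": "12:00:01",
     "duration_s": 42, "cell_id": "A1", "vendor_internal_flag": 7},  # duplicate
]
print(mediate(raw))  # one cleaned record; duplicate and extra field dropped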
A CDR, a Call (or Charging) Data Record, is actually just a record of the call's details, i.e. the name is literally correct.
You can think of it as a data structure or a group of data including the called number, calling number, duration, time of call, network nodes used, etc.
It is used to feed billing systems and analytics, and simply to record details of calls, which can help with diagnosing problems, for example.
It is not a node or a physical element itself, and once the CDRs are collected, for example on a switch, they can be transferred and stored elsewhere.
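For illustration, here is that description as a minimal Python data structure; the fields below are the examples named above, while real CDRs carry many more vendor-specific fields:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class CallDetailRecord:
    calling_number: str
    called_number: str
    start_time: datetime
    duration_s: int
    originating_node: str  # e.g. the switch that generated the record

cdr = CallDetailRecord("336000001", "336000002",
                       datetime(2020, 1, 1, 12, 0, 1), 42, "MSC-01")
print(cdr)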
All the big switching vendors (Nokia, Ericsson, Huawei, etc.) will 'make', i.e. generate, the CDRs on their switches, as this is a basic requirement that operators demand.
The 3GPP organization defines the specification for CDRs; it covers areas like the structure of the CDR, the information the CDR contains, and how CDRs are transferred between network elements. You can find the spec here:
https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1912

Open vSwitch port missing under heavy load, long poll interval observed

Issue description
I have an OpenStack system with an HA management network (VIP) on an OVS (Open vSwitch) port. In this system, under high load (concurrent volume-from-Glance-image creation), the VIP port (an OVS port) goes missing.
Analysis
For now, at the default log level, the only thing observed in the log file is the unreasonably long 62741 ms poll interval warning below:
2017-12-29T16:40:38.611Z|00001|timeval(revalidator70)|WARN|Unreasonably long 62741ms poll interval (0ms user, 0ms system)
Idea for now
I will turn on debug logging to file and try to reproduce the issue:
sudo ovs-appctl vlog/set file:dbg
Question
What else should I do during/after reproducing the issue?
Is this issue typical? If so, what causes it?
I googled "OpenvSwitch troubleshooting" and other related keywords, but the information was all at the data flow/table level instead of this ovs-vswitchd level (am I right?).
Many thanks!
BR//Wey
This issue was not reproduced at the time and I forgot about it, until recently, two years later, when I had a chance to hit it again in a different environment; this time I have more ideas on its root cause.
It could be caused by the hash shifting that happens in bonding: for some reason, the traffic pattern keeps fitting the conditions that trigger shifts again and again (the condition is quite strict, I would say, but there is a chance of hitting it anyway, right?).
The condition for the shift is quoted below; please refer to the full doc here: https://docs.openvswitch.org/en/latest/topics/bonding/
Bond Packet Output
When a packet is sent out a bond port, the bond member actually used is selected based on the packet’s source MAC and VLAN tag (see bond_choose_output_member()). In particular, the source MAC and VLAN tag are hashed into one of 256 values, and that value is looked up in a hash table (the “bond hash”) kept in the bond_hash member of struct port. The hash table entry identifies a bond member. If no bond member has yet been chosen for that hash table entry, vswitchd chooses one arbitrarily.
Every 10 seconds, vswitchd rebalances the bond members (see bond_rebalance()). To rebalance, vswitchd examines the statistics for the number of bytes transmitted by each member over approximately the past minute, with data sent more recently weighted more heavily than data sent less recently. It considers each of the members in order from most-loaded to least-loaded. If highly loaded member H is significantly more heavily loaded than the least-loaded member L, and member H carries at least two hashes, then vswitchd shifts one of H’s hashes to L. However, vswitchd will only shift a hash from H to L if it will decrease the ratio of the load between H and L by at least 0.1.
Currently, “significantly more loaded” means that H must carry at least 1 Mbps more traffic, and that traffic must be at least 3% greater than L’s.
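For concreteness, here is a toy model in Python of the member selection and rebalancing described in that quote; the hash function and thresholds are simplified stand-ins, not the actual OVS code:

import zlib

N_BUCKETS = 256  # source MAC + VLAN hash into one of 256 buckets

def bucket_for(src_mac: str, vlan: int) -> int:
    return zlib.crc32(f"{src_mac}/{vlan}".encode()) % N_BUCKETS

class Bond:
    def __init__(self, members):
        self.members = list(members)
        self.owner = {}                      # bucket -> bond member
        self.bucket_bytes = [0] * N_BUCKETS  # recent traffic per bucket

    def choose_output_member(self, src_mac, vlan, nbytes):
        b = bucket_for(src_mac, vlan)
        if b not in self.owner:              # first packet for this bucket
            self.owner[b] = self.members[b % len(self.members)]
        self.bucket_bytes[b] += nbytes
        return self.owner[b]

    def rebalance(self):
        load = {m: 0 for m in self.members}
        for b, m in self.owner.items():
            load[m] += self.bucket_bytes[b]
        heavy = max(load, key=load.get)
        light = min(load, key=load.get)
        # "Significantly more loaded": illustrative version of the
        # 1 Mbps + 3% rule quoted above (125,000 bytes ~ 1 Mbit).
        if load[heavy] - load[light] > 125_000 and load[heavy] > 1.03 * load[light]:
            for b, m in self.owner.items():
                if m == heavy:               # shift one of H's hashes to L
                    self.owner[b] = light
                    break

A traffic pattern that keeps re-triggering this shift would move a hash (and its flows, e.g. the VIP's traffic) back and forth between members every rebalance interval.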

Speed vs Bandwidth, ISPs, misconception? [closed]

A lot of ISPs sell their products saying: 100 Mbit/s speed.
However, compare the internet to a parcel service, UPS for example.
The amount of packages you can send every second (bandwidth) is something different from the time it takes for a package to arrive (speed).
I know there are multiple meanings of the term 'bandwidth', so is it wrong to advertise with speed?
Wikipedia (http://en.wikipedia.org/wiki/Bandwidth_(computing)):
In computer networking and computer science, bandwidth, network bandwidth, data bandwidth, or digital bandwidth is a measurement of bit-rate of available or consumed data communication resources, expressed in bits per second or multiples of it (bit/s, kbit/s, Mbit/s, Gbit/s, etc.).
This part tells me that bandwidth is measured in Mbit/s, Gbit/s.
So does this mean the majority of ISPs are advertising wrongly, when they should advertise 'bandwidth' instead of speed?
Short answer: Yes.
Long answer: There are several aspects of data transfer that can be measured on an amount-per-time basis; the amount of data per second is one of them, but it can be misleading if not properly explained.
From the network performance point of view, these are the important factors (quoting Wikipedia here):
Bandwidth - maximum rate that information can be transferred
Throughput - the actual rate that information is transferred
Latency - the delay between the sender transmitting the information and the receiver decoding it
Jitter - variation in the time of arrival at the receiver of the information
Error rate - corrupted data expressed as a percentage or fraction of the total sent
So you may have a 10 Mbit/s connection, but if 50% of the sent packets are corrupted, your final throughput is actually just 5 Mbit/s. (Even less, if you consider that a substantial part of the data may be control structures instead of payload.)
Latency may be affected by mechanisms such as Nagle's algorithm and ISP-side buffering.
As specified in RFC 1149, an ISP could sell you an IPoAC package and still be true to its word: 16 pigeons with 32 GB SD cards attached to them carry 512 GB per trip, which over an average air time of around 1 hour is roughly 1.1 Gbit/s of throughput, at a latency of ~3,600,000 ms.
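For concreteness, the arithmetic behind that example, and the error-rate example above, in a few lines of Python:

# Bandwidth vs. latency, using the IPoAC example above.
pigeons, card_gb, air_time_s = 16, 32, 3600.0

payload_bits = pigeons * card_gb * 8e9      # 512 GB per trip, in bits
throughput_gbps = payload_bits / air_time_s / 1e9
latency_ms = air_time_s * 1000

print(f"throughput ~= {throughput_gbps:.2f} Gbit/s")  # ~1.14 Gbit/s
print(f"latency = {latency_ms:,.0f} ms")              # 3,600,000 ms

# The error-rate example: a 10 Mbit/s link with 50% of packets corrupted.
link_mbps, error_rate = 10, 0.5
print(f"effective throughput ~= {link_mbps * (1 - error_rate):.0f} Mbit/s")  # 5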

Why does RIP (Routing Information Protocol) use a maximum hop count of 15? [closed]

I'm reading about RIP, one of the distance-vector protocols, and came to know that the maximum hop count it uses is 15. My doubt is: why is 15 used as the maximum hop count, and not some other number like 10, 12, or maybe 8?
My guess is that 15 is 16 - 1, that is, 2^4 - 1, or put otherwise: the biggest unsigned value that fits in 4 bits of information.
However, the metric field is 4 bytes long, and the value 16 denotes infinity.
I can only guess, but I would say that this allows fast checks with a simple bitmask operation to determine whether the metric is infinity or not.
Now the real question might be: "Why is the metric field 4 bytes long when, apparently, only five bits are used?" And for that, I have no answer.
Protocols often make arbitrary decisions. RIP is a very basic (and rather old) protocol; you should keep that in mind when reading about it. As said above, the metric is a 4-byte field where 16 is equivalent to infinity. 10 is not a power of two, and 8 was probably deemed too small to reach all the routers.
The rationale behind keeping the maximum hop count low is the count-to-infinity problem: higher maximum hop counts would lead to higher convergence times (I'll leave you to look up the count-to-infinity problem on Wikipedia). Certain versions of RIP use split horizon, which addresses this issue.
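A small illustration of the "16 means infinity" rule in a distance-vector update; this is a simplified sketch, not RIP's actual packet handling:

# RIP-style metrics: valid hop counts are 1..15, 16 means unreachable.
INFINITY = 16

def update_route(current_metric: int, advertised_metric: int) -> int:
    # Metric via a neighbor is the advertised metric plus one hop,
    # capped at INFINITY; keep whichever route is shorter.
    via_neighbor = min(advertised_metric + 1, INFINITY)
    return min(current_metric, via_neighbor)

print(update_route(INFINITY, 3))   # 4: learned a new 4-hop route
print(update_route(2, 15))         # 2: the neighbor's route would be infinity
print(update_route(INFINITY, 16))  # 16: still unreachable

Capping at a small INFINITY is what keeps the count-to-infinity process short: a bad route can only be re-advertised a handful of times before every router agrees it is unreachable.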
