I've heard that Quantum Computing can model systems with large numbers of states. For example, if I understood the article cited below correctly, a Quantum Computer with 50 qubits can simultaneously model 2^50 possible states. Moreover:
Quantum computers, which can exploit purely quantum features such as
superpositions and entanglement, should be able to efficiently produce
a series of samples... For classical computers, however, there’s no
known fast algorithm for generating these samples.
Quantum Supremacy Is Coming: Here’s What You Should Know
In testing software, we first have to generate samples. Could a series of qubits instantly generate those samples... and test whether the system under test handles them correctly?
Within software, Redux applications in particular are easy to test because you can initialize them with any state you want, but there are a lot of possible states.
Since Redux applications have large numbers of states, I am wondering: can Quantum Computers be used to test Redux applications by simultaneously representing the many states in the Redux store?
How do I model the Redux states with qubits?
How can the Quantum Computer simultaneously test all of those states?
How do I tell the Quantum Computer whether a test passed?
While reading the Spotify blog, I found a reference to something called "synthetic testing":
Having synthetic tests reduces time to recover
After this work involving timelines, we got some signals on time to recover. One such signal was that TTR was all over the place and genuinely hard to correlate with any single aspect of our systems.
However, we got a hit. One of the more exciting things we learned through our incident study was that synthetic testing works. We spent a fair amount of time grading whether or not a synthetic test would have plausibly detected outages, and then looked at the TTR for those that were in fact detected by synthetic tests, versus those that were not because they were not covered by a synthetic test.
The results were even more striking than we thought. We found that incidents involving coverable features that did have a synthetic test saw a recovery time that was generally 10 times faster. No really, read it again!
This may seem obvious, but we never want to discount the power of data to drive decisions. This isn’t just a curiosity. We’ve adjusted our priorities to put a greater emphasis on synthetic testing, as we think it’s pretty important to get things back up and running as quickly as possible.
What is a synthetic test, and how does it differ from the normal software testing (unit, integration, ...) that runs in CI?
I am researching Peer-To-Peer network architecture for games.
What I have read from multiple sources is that the Peer-To-Peer model makes it easy for people to cheat: sending incorrect data about your game character, whether it is a wrong position or the amount of health points you have.
Now I have read that one of the things that makes Peer-To-Peer more secure is to put an anti-cheat system into your game, which checks things like how fast someone has moved from spot A to spot B, or whether someone's health points changed drastically without a reason.
I have also read about Lockstep, which is described as a "handshake" between all the clients in a Peer-To-Peer network, where clients promise not to do certain things, for instance "move faster than X" or "jump higher than Y", and their actions are then compared against the rules set in the "handshake".
To me this seems like an anti-cheat system.
What I am asking in the end is: What is Lockstep in the Peer-To-Peer model? Is it an anti-cheat system or something else, and where should this system be placed in a Peer-To-Peer setup: on every player's computer, or could it work if it is not on all of the players' computers? Should this system control the whole game, or only a subset?
Lockstep was designed primarily to save on bandwidth (in the days before broadband).
Question: How can you simulate (tens of) thousands of units, distributed across multiple systems, when you have only a vanishingly small amount of bandwidth (14400-28800 baud)?
What you can't do: Send tens of thousands of positions or deltas, every tick, across the network.
What you can do: Send only the inputs that each player makes, for example, "Player A orders this (limited size) group ID=3 of selected units to go to x=12,y=207".
However, the onus of responsibility now falls on each client application (or rather, on developers of P2P client code) to transform those inputs into exactly the same gamestate per every logic tick. Otherwise you get synchronisation errors and simulation failure, since no peer is authoritative. These sync errors can result from a great deal more than just cheaters, i.e. they can arise in many legitimate, non-cheating scenarios (and indeed, when I was a young man in the '90s playing lockstepped games, this was a frequent frustration even over LAN, which should be reliable).
So now you are using only a tiny fraction of the bandwidth. But the meticulous coding required to be certain that clients do not produce desync conditions makes this a lot harder to code than an authoritative server, where non-sane inputs or gamestate can be discarded by the server.
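To see how dramatic the saving is, here is a purely illustrative back-of-the-envelope comparison; the unit counts and message sizes are made up, not taken from any real game:

```python
# Purely illustrative numbers: exact sizes depend on the game and protocol.
units = 10_000
ticks_per_second = 10

# Naive approach: ship every unit's position every tick (say 4 bytes per unit).
state_bytes_per_sec = units * 4 * ticks_per_second        # ~400,000 B/s: hopeless on a 28.8k modem

# Lockstep approach: ship only the player's orders (a human issues a handful per second).
orders_per_sec = 2
bytes_per_order = 8                                        # opcode + group id + target coordinates
input_bytes_per_sec = orders_per_sec * bytes_per_order     # ~16 B/s

print(state_bytes_per_sec, input_bytes_per_sec)
```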
Cheating: It is easy to see things you shouldn't be able to see: every client has all the simulation data available. It is very hard to modify the gamestate without immediately crashing the game.
I stumbled across this question in Google search results, and thought I might as well answer it years later. For future generations, you know :)
Lockstep is not an anti-cheat system, it is one of the common p2p network models used to implement online multiplayer in games (most notably in strategy games). The base concept is fairly straightforward:
The game simulation is split into fairly short time frames.
After each frame, players collect the input commands from that frame and send those commands over the network.
Once all the players receive the commands from all the other players, they apply them to their local game simulation during the next time frame.
If the simulation is deterministic (as it should be for lockstep to work), then after applying the commands all the players will have the same world state. Getting the determinism right is arguably the hardest part, especially for cross-platform games.
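Here is a minimal single-process sketch of those steps (the "network" is just shared data here, and the commands are made up), showing how a fixed application order plus determinism keeps every peer's world identical:

```python
# Minimal single-process sketch: simulated peers exchange only per-frame commands
# and, because they apply them in the same agreed order, end up with identical worlds.
import random

def apply_commands(world, frame_commands):
    # Apply every peer's commands in a fixed order (here: by player id) so the result is deterministic.
    for player_id in sorted(frame_commands):
        for unit, dx in frame_commands[player_id]:
            world[unit] = world.get(unit, 0) + dx

def run_lockstep(num_frames=5, num_players=3):
    worlds = [{} for _ in range(num_players)]   # each peer's local copy of the game state
    rng = random.Random(0)
    for frame in range(num_frames):
        # Each peer collects its own input for the frame (hypothetical "move unit u by dx" commands)...
        frame_commands = {p: [(rng.randrange(4), rng.choice([-1, 1]))] for p in range(num_players)}
        # ...then every peer receives the full command set and advances its own simulation.
        for world in worlds:
            apply_commands(world, frame_commands)
    assert all(w == worlds[0] for w in worlds)  # all peers stayed in sync
    return worlds[0]

print(run_lockstep())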
Being a p2p model, lockstep is inherently weak against cheaters, since there is no agent in the network that can be fully trusted. This is in contrast to, for example, server-authoritative network models, where the developer can trust a privately owned server that hosts the game. Lockstep does not offer any special protection against cheaters by itself, but it can certainly be designed to be less (or more) vulnerable to cheating.
Here is an old but still good write-up on the lockstep model used in the Age of Empires series, if anyone needs a concrete example.
I'm trying to implement my own version of the Asynchronous Advantage Actor-Critic method, but it fails to learn the Pong game. My code was mostly inspired by Arthur Juliani's and OpenAI Gym's A3C versions. The method works well for a simple Doom environment (the one used in Arthur Juliani's code), but when I try the Pong game, the method diverges to a policy where it always executes the same action (always move down, or always move up, or always executes the no-op action). My code is located in my GitHub repository.
I have already adapted my network to resemble the architecture used by OpenAI Gym's A3C version, which is:
4 convolutional layers with the same specs: 32 filters, 3x3 kernels, 2x2 strides, and padding (padding='same'). The output of the last convolutional layer is flattened and fed to an LSTM layer with an output of size 256. The initial states C and H of the LSTM layer are given as inputs. The output of the LSTM layer is then separated into two streams: a fully connected layer with an output size equal to the number of actions (the policy) and another fully connected layer with only one output (the value function) (more details in Network.py of my code);
The loss function is the one given in the original A3C paper. Basically, the policy loss is the log_softmax of the policy logits times the advantage function. The value loss is the square of the difference between the value function and the discounted rewards. The total loss accounts for the value loss, the policy loss, and the entropy. The gradients are clipped to 40 (more details in Network.py of my code; a rough sketch of this loss appears after this list);
There is only one global network and several worker networks (one network for each worker). Only the global network is updated, and this update is done with the local gradients of each worker network. Each worker simulates the environment for BATCH_SIZE iterations, saving the state, value function, chosen action, reward received, and the LSTM state. After BATCH_SIZE iterations (I used BATCH_SIZE = 20), each worker passes this data through the network, calculates the discounted rewards, the advantage function, the total loss, and the local gradients, and then updates the global network with those gradients. Finally, the worker's local network is synchronized with the global network (local_net = global_net). All workers do this asynchronously (for more details on this step, check the work and train methods of the Worker class inside Worker.py);
The LSTM states C and H are reset between episodes. It is also important to note that the current states C and H are kept locally by each worker;
To apply the gradients to the global network, I used the AdamOptimizer with a learning rate of 1e-4.
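For concreteness, here is a rough PyTorch-style sketch of the loss described in the list above (this is not my actual Network.py code, and the value/entropy coefficients are only illustrative):

```python
# Rough sketch of the A3C loss described above; coefficients are illustrative.
import torch
import torch.nn.functional as F

def a3c_loss(policy_logits, values, actions, discounted_returns,
             value_coef=0.5, entropy_coef=0.01):
    log_probs = F.log_softmax(policy_logits, dim=-1)           # log pi(a|s)
    probs = log_probs.exp()
    taken_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)

    advantages = discounted_returns - values.detach()           # A = R - V(s), no grad through the baseline
    policy_loss = -(taken_log_probs * advantages).mean()        # maximize log pi * A
    value_loss = (discounted_returns - values).pow(2).mean()    # squared error against discounted rewards
    entropy = -(probs * log_probs).sum(dim=-1).mean()           # exploration bonus

    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

The gradients of this total loss are what get clipped and applied to the global network.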
I have already tried different configurations of the network (several different convolutional layer configurations, including different activation functions), other optimizers (RMSPropOptimizer and AdadeltaOptimizer) with different parameter configurations, and different values of BATCH_SIZE. But it almost always ends up diverging to a policy where it executes only one action. I say "almost" because there are certain configurations where the agent maintains a policy similar to a random policy for several episodes, with no apparent improvement (I waited up to 62k episodes before giving up in those cases).
Therefore, I would like to know if anyone has succeeded in training an agent on the Pong game using A3C with an LSTM layer. If so, what parameters did you use? Any help would be appreciated!
[EDIT] As I said in the comments, I managed to partially solve the problem by feeding the correct LSTM state before calculating the gradients (instead of feeding an initialized LSTM state). This made the method learn reasonably well for the PongDeterministic environment. But the problem persists when I try Breakout-v0: the agent reaches a mean score of 40 in about 65k episodes, but it seems to stop learning after that (it maintained this score for some time). I have checked the OpenAI starter agent several times and I can't find any significant differences between my implementation and theirs. Any help would be extremely appreciated!
I'm working in a distributed memory environment. My task is to simulate big 3D objects, made of particles tied together by springs, by dividing them into smaller pieces; each piece is simulated by a different computer. I'm using a 3rd-party physics engine to achieve the simulation. The problem I am facing is how to transmit the particle information at the extremities where the object is divided. This information is needed to compute the interacting particle forces. The line in the image shows where the cut has been made. Because the number of particles is big, the communication overhead will be big as well. Is there a good way to transmit such information, or is there a way to transmit another value which helps me determine the information I need? Any help is much appreciated. Thank you.
PS: by particle information I mean the new positions from which to compute a resulting force to be applied to the particles simulated on the local machine.
"Big" means lots of things. Here the number of points with data being communicated may be "big" in that it's much more than one, but if you have say a million particles in a lattice, and are dividing it between 4 processors (say) by cutting it into squares, you're only communicating 500 particles across each boundary; big compared to one but very small compared to 1,000,000.
A library very commonly used for these sorts of distributed-memory computations (which is somewhat different from distributed computing, which suggests nodes scattered all over the internet; this sort of computation, involving tightly-coupled elements, is usually best done with a series of nearby computers in a lab or in a cluster) is MPI. This pattern of communication is very common, and is called "halo exchange" or "guardcell exchange" or "ghostzone exchange" or some combination; you should be able to find lots of examples of such things by searching for those terms. (There are a few questions on this site on the topic, but they're typically focused on very specific implementation questions).
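As a rough sketch of what such a halo exchange can look like (here with mpi4py, a 1-D decomposition, and made-up particle counts and ghost widths):

```python
# Rough 1-D halo-exchange sketch with mpi4py: each rank owns a slab of the object and
# only ships the boundary particle positions its neighbour needs to compute spring forces.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 1000          # particles owned by this rank (hypothetical)
ghost_width = 10        # boundary particles a neighbour needs (depends on spring connectivity)
positions = np.random.rand(n_local, 3)

left = rank - 1 if rank > 0 else MPI.PROC_NULL         # PROC_NULL: edge ranks skip that exchange
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

ghost_from_left = np.zeros((ghost_width, 3))
ghost_from_right = np.zeros((ghost_width, 3))

# Ship my rightmost particles to the right neighbour while receiving its leftmost ones, and vice versa.
comm.Sendrecv(np.ascontiguousarray(positions[-ghost_width:]), dest=right,
              recvbuf=ghost_from_left, source=left)
comm.Sendrecv(np.ascontiguousarray(positions[:ghost_width]), dest=left,
              recvbuf=ghost_from_right, source=right)

# Spring forces near each cut can now be computed from the ghost positions;
# the exchange is repeated every timestep after positions are updated.
```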
In R. Kent Dybvig's paper "Three Implementation Models for Scheme" he speaks of "FFP languages" and "FFP machines". Apparently there is some correlation between FFP machines and string reduction on multiple processors.
Googling doesn't really uncover much in terms of explanations or examples.
Can anyone shed some light on this topic?
Thanks.
Kent Dybvig's advisor, Gyula A. Mago, published a detailed description in 1987: "The FFP Machine: Technical Report 87-014", by Mago and Stanat.
As of this writing, the PDF is freely available at:
http://www.cs.unc.edu/techreports/87-014.pdf
The FFP Machine is a very fine-grained parallel computer architecture: each processor holds a single symbol/atom/value. It uses a string-reduction model of computation in which innermost function applications are found and replaced by their equivalent result (eager evaluation). Where a result is used in several places, it tends to be re-evaluated instead of incurring the costs of accessing some global store (but see Mago's paper on "Copying Operands vs Copying Results", or better yet Mago's "Data Sharing in an FFP Machine" in the 1982 Functional Programming Languages and Computer Architecture conference).
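As a toy illustration only (ordinary Python tuples, not FFP syntax, and purely sequential, whereas the machine rewrites all innermost applications in the same sweep), innermost string reduction looks roughly like this:

```python
# Toy innermost ("string") reduction: repeatedly find an application whose
# operands are already values and replace it in place with its result.
OPS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}

def is_value(e):
    return isinstance(e, int)

def reduce_innermost(e):
    """Rewrite one innermost application (op, a, b) whose operands are both values."""
    if is_value(e):
        return e, False
    op, a, b = e
    if is_value(a) and is_value(b):
        return OPS[op](a, b), True                 # innermost redex: replace by its result
    a2, done = reduce_innermost(a)
    if done:
        return (op, a2, b), True
    b2, done = reduce_innermost(b)
    return (op, a, b2), done

expr = ('+', ('*', 2, 3), ('*', 4, 5))
while not is_value(expr):
    expr, _ = reduce_innermost(expr)
    print(expr)   # ('+', 6, ('*', 4, 5)) -> ('+', 6, 20) -> 26
```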
The L cells holding the FFP expression being reduced communicate through a tree-structured arrangement of T cells. Note that ICs are basically two-dimensional and, with wiring, circuits can move towards being three-dimensional in physical space. Interconnection networks that occupy higher dimensions (such as the Hypercube, Omega, Banyan, Star, etc. networks) will eventually be unable to perform near their theoretical limit.

This communication network is circuit-switched rather than packet-switched. Data packets contain no addresses and do not need routing. Packets from distinct reductions cannot meet, cannot conflict, and cannot experience congestion with each other.
The configuring activity (called "Partitioning") is performed in a single sweep upwards in the tree, using a handful of logic operations on 3-bit messages, leaving "area machines" in its wake, each created to advance at most a single reducible application. While it is technically logarithmic in time, the resulting area machines can begin communicating in a pipelined fashion behind the partitioning wave, practically costing only a constant time penalty. (The dismantling of area machines remains a logarithmic cost in time.)
Packets within a single reduction should, and must, meet, and thus provide an often-useful synchronization. Sequences of packets are sorted and combined as they rise within an area, to be broadcast from the root of the area machine. Parallel Prefix and Parallel Suffix operations are provided to reduce area traffic, since there remains a potential bottleneck within an individual reducible application. This is accomplished without the need, exhibited in the Ultracomputer (Jack (Jacob?) Schwartz at NYU), for a separate logarithmic-sized cache memory in each communication node. Each T cell (internal tree node) only needs a FIFO buffer (for efficiency) of size greater than the pipeline path to the top of the tree and back down. (This latter is a conjecture of mine, but it seems reasonable.)
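For readers unfamiliar with the operation, here is a toy, sequentialised sketch of a logarithmic-depth parallel prefix (scan); each pass stands in for one level of combining that a tree of T cells could perform simultaneously:

```python
# Toy inclusive scan in the Hillis-Steele style: log2(n) passes, each of which
# a parallel machine could do in one step; here the passes run sequentially.
def parallel_prefix(xs, op=lambda a, b: a + b):
    xs, n, step = list(xs), len(xs), 1
    while step < n:
        xs = [op(xs[i - step], xs[i]) if i >= step else xs[i] for i in range(n)]
        step *= 2
    return xs

print(parallel_prefix([1, 2, 3, 4, 5]))   # [1, 3, 6, 10, 15]
```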
Since the tree maintains the left-to-right order of data (unlike some other combining networks), the system enables cells to rotate their data in logarithmic rather than linear time, avoiding the plausible congestion at the root of the area machine. It is worth noting again that the parallelism within an area machine is independent of the simultaneous parallelism in other area machines, and has available to it a number of processors proportional to the quantity of data in the operand.
Have you come across this yet: Compiling APL for parallel execution on an FFP machine?
Formal FP, which is similar to FP but with a regular, sugarless syntax intended for machine execution, is all I can offer you.
See the Wikipedia FP page.