Actual work hours larger than planned work hours in Microsoft Project

I update my project's actual work hours every week. But if the actual work hours exceed the planned work hours, the planned work hours are automatically changed to equal the actual work hours. Is there any way for actual work hours to be larger than planned work in MS Project?

No, estimated work is always greater than or equal to actual work, and remaining work is always greater than or equal to zero.
You can, for example, use baselines to store the history of changes in the estimate.


Why is the fetch percentage inside the RemoteConfig parameter so small?

I am running 2 A/A tests for each of the two applications (iOS/Android), using a random toggle for 100% of the application's users with a 50/50 distribution.
However, on Android in RemoteConfig, the values for the variant do not exceed 40-45%, and on iOS they do not exceed 5-7%. At the same time, more than 5-7% of the total number of events were collected. Could you tell me what these percentages mean in this particular example?
Most likely you are seeing cached values on the clients: by default, Remote Config caches values for about 12 hours before trying to download new values from the service.
It may take a day or more for the results to propagate and for this kind of split to show up. The timing also depends on your users' usage, which is compared to the total baseline.
It is recommended to read about this loading behavior in the Remote Config documentation.

Manage project with tasks that can be completed in any order

Probably best described by example:
I have a project with 10 tasks of 1 day to be done by 1 resource. So they have to happen sequentially, easy to arrange by levelling or linking and the project will last 10 days. But this sets a task order, and it seems that Project expects them to be done in that order.
If they are, it is easy to get an indication from the standard reports about overall project progress. I can get a "late tasks" report which will show exactly that.
But the reality in our projects is that each task is done a bit at a time. So after 5 days, we are just as likely to have done 50% of all the tasks, or 100% of the last five, as 100% of the first five, and for us that is equally satisfactory progress.
I have tried but I can't seem to find a way of knowing if the overall project is on track by work done regardless of task order. Is there an easy answer for this?

How to auto restrict the view in rpivottable to be data protection compliant

I am starting a customer lifetime project at work and want to share how the data looks with the business, as I want to be able to identify the important variables with them. I plan to do this using the excellent rpivottable package and launch a shiny app to see where there are basic differences in groups to select my features.
This would mean taking my customer base of 4 million customers and slicing and dicing it in a number of ways.
However, following GDPR we need to ensure no group is shown that has fewer than 7 customers in it. Therefore I need some kind of background calculation to ensure that groups with fewer than 7 customers are never shown.
If I think logically about this, the only way I could see it working would be to make a change to the pivottable, have some form of submit button, so that the size of groups could be calculated, and then a filter (which needs to be hidden from the user so it cannot be switched off) is applied.
I know I should provide code, but I do not know where to start here. Has anyone had similar issues and has a potential solution to all or part of the problem?
Has anyone built a hidden filter into their rpivottable?
Has anyone been able to restrict their output to only show 90% of their data?
To be absolutely sure, you would need to load in a data frame that looks like "dim, dim, dim, count", where count is always at least 7. Basically, this is just a bit of preprocessing on your input data. Unfortunately, it means that you will be restricted to a small number of coarse dimensions, or else you will end up filtering out everything.
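As a rough illustration of that preprocessing step, here is a minimal sketch in plain Python rather than R. The column names (`region`, `plan`), the helper name `aggregate_and_suppress` and the sample data are made up for the example; the threshold of 7 comes from the question.

```python
from collections import Counter

MIN_GROUP_SIZE = 7  # threshold taken from the question


def aggregate_and_suppress(rows, dims, min_size=MIN_GROUP_SIZE):
    """Count customers per combination of `dims` and drop small groups.

    `rows` is an iterable of dicts, one per customer; `dims` names the
    columns to keep. Returns a list of dicts shaped like
    "dim, dim, ..., count", containing only groups with count >= min_size.
    """
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    return [
        {**dict(zip(dims, key)), "count": n}
        for key, n in sorted(counts.items())
        if n >= min_size
    ]


# Hypothetical customer records, one dict per customer.
customers = (
    [{"region": "north", "plan": "basic"}] * 12
    + [{"region": "north", "plan": "premium"}] * 3  # too small, suppressed
    + [{"region": "south", "plan": "basic"}] * 7
)
safe = aggregate_and_suppress(customers, dims=["region", "plan"])
```

Because every surviving row already represents at least 7 customers, any coarser roll-up of `safe` is a sum of such rows and so also stays at or above the threshold. The pivot table then has to sum the `count` column instead of counting rows, and the totals will be understated by whatever was suppressed.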

Spark inconsistency when running count command

A question about inconsistency of Spark calculations. Does this exist? For example, I am running EXACTLY the same command twice, e.g.:
And I am getting slightly different results every time I run it (141,830, then 142,314)!
Or this:
and getting 2,587,013, and then 2,586,943. How is it even possible?
Thank you!
As per your comment, you are using sampleBy in your pipeline. sampleBy doesn't guarantee you'll get the exact fractions of rows. It includes each record in the sample with a probability equal to the fraction for its stratum, so the result can vary from run to run.
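To see why the counts drift, here is a small plain-Python stand-in for that kind of per-record sampling. It is not Spark code; the function `sample_by`, the `label` column and the fractions are made up for illustration.

```python
import random


def sample_by(rows, key, fractions, seed=None):
    """Keep each row independently with probability fractions[row[key]].

    A plain-Python stand-in for stratified Bernoulli sampling; strata
    missing from `fractions` are treated as fraction 0.
    """
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < fractions.get(r[key], 0.0)]


rows = [{"label": i % 2, "value": i} for i in range(10_000)]
fractions = {0: 0.1, 1: 0.5}

# Unseeded runs: sizes hover around 0.1*5000 + 0.5*5000 = 3000 but differ.
unseeded_sizes = [len(sample_by(rows, "label", fractions)) for _ in range(3)]

# Seeded runs: the same seed reproduces the same sample.
first = sample_by(rows, "label", fractions, seed=42)
second = sample_by(rows, "label", fractions, seed=42)
```

PySpark's sampleBy also accepts an optional seed argument. Fixing it should help, but the result can still change if the upstream data arrives in a different order or partitioning between runs.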
Regarding your monotonically_increasing_id question in the comments: it only guarantees that the next id is larger than the previous one; it doesn't guarantee that ids are consecutive (i, i+1, i+2, etc.).
Finally, you can persist a data frame by calling persist() on it.
Ok, I have suffered majorly from this in the past. I had a seven or eight stage pipeline that normalised a couple of tables, added ids, joined them and grouped them. Consecutive runs of the same pipeline gave different results, although not in any coherent pattern I could understand.
Long story short, I traced this feature to my usage of the function monotonically_increasing_id, supposedly resolved by this JIRA ticket, but still evident in Spark 2.2.
I do not know exactly what your pipeline does, but my fix was to force Spark to persist results after calling monotonically_increasing_id. I never saw the issue again after I started doing this.
Let me know if a judicious persist resolves this issue.
To persist an RDD or DataFrame, call either df.cache() (which defaults to in-memory persistence) or df.persist([some storage level]).
Again, it may not help you, but in my case it forced Spark to flush out and write id values which were behaving non-deterministically given repeated invocations of the pipeline.

How to retrieve a row's position within a DynamoDB global secondary index and the total?

I'm implementing a leaderboard backed by DynamoDB and its Global Secondary Indexes, as described in the developer guide.
But two things a leaderboard system really needs are your position within it and the total number of entries, so you can show #1 of 2000, or similar.
Using the index, the rows are sorted the correct way, and I'd assume these calls would be cheap enough to make, but I haven't been able to find a way, as of yet, how to do it via their docs. I really hope I don't have to get the entire table every single time to know where a person is positioned in it, or the count of the entire table (although if that's not available, that could be delayed, calculated and stored outside of the table at scheduled periods).
I know DescribeTable gives you information about the entire table, but I would be applying filters to the range key, so that wouldn't suit this purpose.
I am not aware of any efficient way to get the ranking of a player. The naive way is to query starting from the player with the most points and move downward, incrementing a counter until you reach the target player. So for the user with the fewest points, you might end up scanning the whole range.
That being said, you can still get the top 100 players (the leaders) with no problem. Just do a query starting from the player with the most points, and set the query limit to 100.
Also, for a given player, you can get 100 players around him with similar points. You just need to do two queries like:
query with hashkey="" and rangekey <= his points, in descending order, limit 50
query with hashkey="" and rangekey >= his points, in ascending order, limit 50
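The walk-and-count ranking and the two neighbour queries can be sketched as follows. This is an in-memory stand-in only: a plain dict plays the role of the index, sorting plays the role of reading it in order, and the function and player names are made up for illustration.

```python
def rank_by_walking(scores, player):
    """Return the 1-based rank of `player` by walking from the top down.

    `scores` maps player -> points. Sorting stands in for reading the
    index in descending order; the cost grows linearly with the rank.
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for position, (name, _points) in enumerate(ordered, start=1):
        if name == player:
            return position
    raise KeyError(player)


def neighbours(scores, player, limit=2):
    """Return up to `limit` players at or below and at or above `player`.

    Mirrors the two range queries: "<= points" read in descending order
    and ">= points" read in ascending order, each capped at `limit`.
    """
    target = scores[player]
    others = [(n, p) for n, p in scores.items() if n != player]
    below = sorted((x for x in others if x[1] <= target),
                   key=lambda x: x[1], reverse=True)[:limit]
    above = sorted((x for x in others if x[1] >= target),
                   key=lambda x: x[1])[:limit]
    return below, above


scores = {"ann": 900, "bob": 750, "cy": 600, "dee": 450, "eve": 300}
cy_rank = rank_by_walking(scores, "cy")
cy_below, cy_above = neighbours(scores, "cy", limit=2)
```

The read order matters for the "below" query: read in ascending order with a limit, it would return the lowest-scoring players instead of the nearest ones.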
This was the exact same problem we faced when we were developing our app. Here are the two solutions we came up with:
Query your index with ScanIndexForward set to false, which returns the top players in descending order (assuming your score/points key is the range key), with a limit of 1000. Then apply the linear formula y = mx + b, where x is points and y is rank: take two entries from the result, typically the first and the last, to solve for m and b. With those you can estimate the rank from a user's points. This will not be the exact rank, only an approximation; Google does the same when we search for something in our mail, showing an approximate value and not the exact value on the first call.
Get all the records and store them in a cache until the next update. This is by far the best and least expensive option, and it is the one we are using.
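The y = mx + b estimate from the first solution can be sketched like this, assuming the two sample points are the first and last entries of the fetched page. The function names and the numbers are made up for illustration.

```python
def fit_line(p1, p2):
    """Solve y = m*x + b through two (points, rank) samples."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("samples must have different points values")
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b


def approximate_rank(points, m, b):
    """Estimate a rank from points; never report better than rank 1."""
    return max(1, round(m * points + b))


# Hypothetical page of 1000 results: the top player has 2000 points
# (rank 1) and the last fetched player has 1001 points (rank 1000).
m, b = fit_line((2000, 1), (1001, 1000))
estimate = approximate_rank(1500, m, b)
```

The estimate is only as good as the assumption that points fall off roughly linearly with rank. It is exact at the two sample points and can be far off for players outside the fetched page.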
The beauty of DynamoDB is that it is highly optimized for very specific (and common) use cases. The cost of this optimization is that many other use cases cannot be achieved as easily as with other databases. Unfortunately yours is one of them. That being said, there are perfectly valid and good ways to do this with DynamoDB. I happen to have built an application that has the same requirement as yours.
What you can do is enable DynamoDB Streams on your table and process item update events with a Lambda function. Every time the number of points for a user changes you re-compute their rank and update your item. Even if you use the same scan operation to re-compute the rank, this is still much better, because it moves the bulk of the cost from your read operation to your write operation, which is kind of the point of NoSQL in the first place. This approach also keeps your point updates fast and eventually consistent (the rank will not update immediately, but is guaranteed to update properly unless there's an issue with your Lambda function).
I recommend going with this approach and, once you reach scale, optimizing by caching your users by rank in something like Redis (start with Redis right away only if you have prior experience with it and can set it up quickly). Pick whatever is simplest first. If you are concerned about your leaderboard changing too often, you can reduce the cost by re-computing the ranks of only the first, say, 100 users, and scheduling another Lambda function to run every several minutes to scan all users and update their ranks all at once.
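A minimal sketch of the stream-processing side, under loud assumptions: the attribute names `player_id` and `points` are hypothetical, the stream is assumed to include the new item image, and the module-level `SCORES` dict is an in-memory stand-in for reading the table and writing ranks back, which a real Lambda function would do through the AWS SDK.

```python
SCORES = {}  # in-memory stand-in for the table; not durable in a real Lambda


def changed_scores(event):
    """Extract {player_id: points} from INSERT/MODIFY stream records."""
    changes = {}
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        image = record["dynamodb"].get("NewImage", {})
        if "player_id" in image and "points" in image:
            # Stream images carry typed values, with numbers as strings.
            changes[image["player_id"]["S"]] = int(image["points"]["N"])
    return changes


def recompute_ranks(scores):
    """Return {player_id: rank}; rank 1 has the most points.

    Ties are broken alphabetically by player id.
    """
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return {player: pos for pos, player in enumerate(ordered, start=1)}


def handler(event, context=None):
    """Apply the changed scores, then recompute every rank."""
    SCORES.update(changed_scores(event))
    return recompute_ranks(SCORES)


def _record(player, points, name="MODIFY"):
    # Helper that builds a record shaped like a stream event entry.
    return {"eventName": name, "dynamodb": {"NewImage": {
        "player_id": {"S": player}, "points": {"N": str(points)}}}}


ranks_1 = handler({"Records": [_record("ann", 900, "INSERT"),
                               _record("bob", 750, "INSERT")]})
ranks_2 = handler({"Records": [_record("bob", 950)]})
```

If the ranks are written back to the same table, those writes produce stream records too, so a real handler would need to skip records where only the rank changed.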
