Increase students' average mark and adjust each student's mark accordingly - math

I'm confused about how to do this correctly, and I'm not sure what the right method is.
E.g.: I have 5 students with exam marks of 1, 2, 3, 4 and 5 respectively.
average: (1 + 2 + 3 + 4 + 5) / 5 = 3.00
Now I want to add 5 to the average.
new average: 3.00 + 5 = 8.00
Question: How should I adjust each student's mark depending on the value added to the average?

If you want the increase distributed among the students in proportion to their marks, split the total added marks, 25 (number of students (5) * increase in the average (5)), according to each student's share of the total marks (15).
i.e.
For the first student, add (1/15)*25 to his mark
For the second student, add (2/15)*25 to his mark
For the third student, add (3/15)*25 to his mark, and so on
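The arithmetic above can be checked with a short R sketch. The variable names are illustrative, and the uniform adjustment is an alternative added here for comparison; it is not part of the answer.

```r
# Marks from the question and the desired increase in the average.
marks <- c(1, 2, 3, 4, 5)
increase <- 5

# Total marks that must be added so that the mean rises by `increase`.
total_added <- length(marks) * increase

# Proportional adjustment: each student receives a share of the added
# marks equal to his or her fraction of the original total.
adjusted_proportional <- marks + (marks / sum(marks)) * total_added

# Uniform adjustment (alternative, not from the answer): add the same
# amount to every student.
adjusted_uniform <- marks + increase

mean(adjusted_proportional)
mean(adjusted_uniform)
```

Both adjustments raise the mean from 3 to 8; they differ only in how the extra 25 marks are shared out.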

Related

Calculating a ratio in a ggplot2 graph while retaining faceting variables

So I don't think this has been asked before, but SO search might just be getting confused by combinations of 'ratio' and 'faceting'. I'm trying to calculate a productivity ratio: the number of widgets produced per worker on a given day or period. I've got my data structured in a single data frame, with each widget produced each day by each worker in its own record, and other workers who worked that day but didn't produce a widget also in their own record, along with various metadata.
Something like this:
widget_ind  employee_active_ind  employee_id  day       product_type  employee_bu
1           1                    123          6/1/2021  pc            americas
0           1                    234          6/1/2021  mac           emea
0           1                    345          6/1/2021  mac           apac
1           1                    444          6/1/2021  mac           americas
1           1                    333          6/1/2021  pc            emea
0           1                    356          6/1/2021  pc            americas
I'm trying to find the ratio of widget_inds to employee_active_inds, over time, while retaining the metadata, so that I can filter or facet within the ggplot2 code, something like:
plot <- ggplot(data = df[df$employee_bu == 'americas',], aes(y = (widget_ind/employee_active_ind), x = day)) +
  geom_bar(stat = 'identity', position = 'stack') +
  facet_wrap(product_type ~ ., scales = 'fixed') # change these to look at different cuts of metadata
print(plot)
Retaining the metadata is appealing rather than making individual dataframes summarizing by the various combinations, but the results with no faceting aren't even correct (e.g. the ggplot is showing a barchart with a height of ~18 widgets per person; creating a summarized dataframe with no faceting is showing a ratio of less than 1 widget per person).
I'm currently getting this warning when I run the ggplot code:
Warning message:
Removed 9865 rows containing missing values (geom_bar).
Which doesn't make sense since in my data frame both widget_ind and employee_active_ind have no NA values, so calculating the ratio of the two should always work?
Edit 1: Clarifying employee_active_ind: I should not have any employee_active_ind = 0, but my current joins produce them (and it passes the reality sniff test; the process we are trying to model allows you to do work on day 1 that results in a widget on day 2, when you may not do any work, so you wouldn't be counted as active on that day). I think I need to rethink my data structure. Even so, I'm assuming here that ggplot2 is acting like it would for a given bar chart: it's taking the number in each widget_ind record for a given day (along with any facets and filters), then summing that set and displaying the result. The wrinkle I'm adding is dividing by the number of active employees on that day, and while you can have someone out on a given day, you'd never have everyone out. But that isn't what ggplot is doing, is it?
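I agree with MrFlick, especially on the question about employee_active_ind values of 0. If you have them, the row-by-row division produces non-finite results: in R, 0/0 evaluates to NaN and 1/0 to Inf, so the ratio can be missing even though neither column contains NA.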
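One way to get the intended ratio is to sum both indicators per group first and divide afterwards, instead of dividing row by row. The following is a minimal base R sketch under that assumption; the data frame reproduces the six sample rows from the question, and the column name widgets_per_employee is made up for illustration.

```r
# Sample rows from the question.
df <- data.frame(
  widget_ind          = c(1, 0, 0, 1, 1, 0),
  employee_active_ind = c(1, 1, 1, 1, 1, 1),
  employee_id         = c(123, 234, 345, 444, 333, 356),
  day                 = rep("6/1/2021", 6),
  product_type        = c("pc", "mac", "mac", "mac", "pc", "pc"),
  employee_bu         = c("americas", "emea", "apac", "americas", "emea", "americas")
)

# Sum widgets and active employees per day and metadata combination.
ratio <- aggregate(
  cbind(widget_ind, employee_active_ind) ~ day + product_type + employee_bu,
  data = df,
  FUN = sum
)

# Divide the sums; groups with no active employees get NA explicitly
# instead of NaN or Inf.
ratio$widgets_per_employee <- ifelse(
  ratio$employee_active_ind > 0,
  ratio$widget_ind / ratio$employee_active_ind,
  NA
)

ratio
```

The summarized data frame still carries product_type and employee_bu, so it can be filtered or faceted. A coarser cut (for example, all business units together) would need the sums recomputed at that level, because ratios of sums cannot simply be added.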

R Question: How can I create a histogram with 2 variables against each other?

Okay, let me be as clear as I can in my problem. I'm new to R, so your patience is appreciated.
I want to create a histogram using two different vectors. The first vector contains a list of models (products). These models are listed as either integers, strings, or NA. I'm not exactly sure how R is storing them (I assume they're kept as strings), or if that is a relevant issue. I also have a vector containing a list of incidents pertaining to that model. So for example, one row in the dataframe might be:
Model Incidents
XXX1991 7
How can I create a histogram where the number of incidents for each model is shown? So the histogram will look like
             |  =
             |  =
Frequency of |  =
Incidents    |  =  =
             |  =  =  =
             |  =  =  =  =  =
                -  -  -  -  -  -
               Each different Model
Just to give a general idea.
I also need to be able to map everything out with standard deviation lines, so that it's easy to see which models are the least reliable. But that's not the main question here. I just don't want to do anything that will make me unable to use standard deviation in the future.
So far, all I really understand is how to make a histogram with the frequency marked, but for some reason, the x-axis is marked with numbers, not the models' names.
I don't really care if I have to download new packages to make this work, but I suspect that this already exists in basic R or ggplot2 and I'm just too dumb to figure it out.
Feel free to ask clarifying questions. Thanks.
EDIT: I forgot to mention, there are multiple rows of incidents listed under each model. So to add to my example earlier:
Model Incidents
XXX1991 7
XXX1991 1
XXX1991 19
3
5
XXX1002 9
XXX1002 4
etc . . .
I want to add up all the incidents for a model under one label.
I am assuming that you did not mean to leave the model blank in your example, so I filled in some values.
You can add up the number of incidents by model using aggregate, then make the relevant plot using barplot.
## Example Data
data = read.table(text="Model Incidents
XXX1991 7
XXX1991 1
XXX1991 19
XXX1992 3
XXX1992 5
XXX1002 9
XXX1002 4",
header=TRUE)
TAB = aggregate(data$Incidents, list(data$Model), sum)
TAB
#   Group.1  x
# 1 XXX1002 13
# 2 XXX1991 27
# 3 XXX1992  8
barplot(TAB$x, names.arg=TAB$Group.1)
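Since the question also mentions standard deviations, the same grouping can be sketched with tapply, which keeps the model names attached to the results. This is an illustrative alternative rather than the answer's method, and the object names incidents, totals and spread are made up here.

```r
# Same example data as above, built directly as a data frame.
incidents <- data.frame(
  Model = c("XXX1991", "XXX1991", "XXX1991",
            "XXX1992", "XXX1992",
            "XXX1002", "XXX1002"),
  Incidents = c(7, 1, 19, 3, 5, 9, 4)
)

# Total incidents per model; the result is named by model.
totals <- tapply(incidents$Incidents, incidents$Model, sum)

# Standard deviation of the incident counts within each model.
spread <- tapply(incidents$Incidents, incidents$Model, sd)

totals
spread
```

Because totals carries the model names, passing it to barplot would label each bar with its model rather than with a number.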

R - return the name with the least number of occurrences

I need to find the sector with the lowest frequency in my data frame. Using min gives the minimum number of occurrences, but I would like to obtain the name of the sector with that lowest number of occurrences. So in this case, I would like it to print "Consumer Staples". I keep getting the frequency and not the actual sector name. Is there a way to do this?
Thank you.
sector_count <- count(portfolio, "Sector")
sector_count
Sector freq
1 Consumer Discretionary 5
2 Consumer Staples 1
3 Health Care 2
4 Industrials 3
5 Information Technology 4
min(sector_count$freq)
[1] 1
You want
sector_count$Sector[which.min(sector_count$freq)]
which.min(sector_count$freq) returns the index (row) of the first minimum value. The sector_count$Sector vector is then subset at that index to give the corresponding name.
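A self-contained sketch of that lookup follows. Since the portfolio data is not shown, the counts table is rebuilt by hand here. The second lookup is an addition for the case where several sectors tie for the minimum, which which.min does not report.

```r
# Counts table from the question, rebuilt by hand for illustration.
sector_count <- data.frame(
  Sector = c("Consumer Discretionary", "Consumer Staples", "Health Care",
             "Industrials", "Information Technology"),
  freq = c(5, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Name of the sector at the first minimum.
least_first <- sector_count$Sector[which.min(sector_count$freq)]

# All sectors that share the minimum frequency (handles ties).
least_all <- sector_count$Sector[sector_count$freq == min(sector_count$freq)]

least_first
least_all
```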

Math? IF Statements. Multiple scenarios

This is not high level Math, but I am struggling to apportion the appropriate amounts to my upsell types.
Type Revenue Sales Units
1st Position 24 3 3
2nd Position1 10 2
2nd Position2 5 1
Only my 1st positions count toward the return per sale values and they include the revenue of my 2nd positions that were generated from that sale.
IMPORTANT NOTE: If the return per sale value is greater than or equal to 12 I must apply a split for the amount over 12. My less than 12 value is *0.6 of the revenue, my greater than 12 value is *0.6 of amount less than 12 and *0.4 of amount over.
IF TotalRev/1stPositionUnits < 12 THEN TotalRev*.6 ELSE (7.2+(((TotalRev/1stPositionUnits)-12)*.4))*1stPositionUnits = Total Revenue To Company.
IF (24+10+5)/3 < 12
THEN (24+10+5)*.6
ELSE ((.6*12)+((((24+10+5)/3)-12)*.4))*3
Net Revenue = 22.8
Net Return Per Sale = (Net Revenue/1st Position Units) = (22.8/3) = 7.6
Now I want to determine how much of the 7.6 came from the 1st position revenue, how much from the 2nd position1 revenue and how much from the 2nd position2 revenue...
My Attempt:
(1st Position Revenue/Total Revenue)*Net Return Per sale = (24/(24+10+5))*7.6 = 4.68
(2nd Position1 Revenue/Total Revenue)*Net Return Per sale = (10/(24+10+5))*7.6 = 1.95
(2nd Position2 Revenue/Total Revenue)*Net Return Per sale = (5/(24+10+5))*7.6 = 0.97
Double Check = (4.68+1.95+0.97) = 7.6
Judging from your question, I think your approach is fine for your case.
Assume the variables are named after the quantities in the question (rev_1st_pos, rev_2nd_pos1, rev_2nd_pos2, new_rev_1st_pos, new_rev_2nd_pos1, new_rev_2nd_pos2, net_rev, net_return_per_sale, unit_1st_pos).
The variables have the following meanings:
rev_1st_pos=Revenue of 1st Position
rev_2nd_pos1=Revenue of 2nd Position1
rev_2nd_pos2=Revenue of 2nd Position2
new_rev_1st_pos=New Revenue of 1st Position after total calculation
new_rev_2nd_pos1=New Revenue of 2nd Position1 after total calculation
new_rev_2nd_pos2=New Revenue of 2nd Position2 after total calculation
unit_1st_pos=1st position Units
net_rev=Net Revenue calculated as shown below
net_return_per_sale=Net Return per sale as calculated below
check= a variable used to check final value equals the net_return_per_sale.
Code:
total_rev = (rev_1st_pos + rev_2nd_pos1 + rev_2nd_pos2);
if ((total_rev / unit_1st_pos) >= 12)
{
    // 0.6 on the first 12 per sale, 0.4 on the amount above 12; (12*0.6) can be replaced by 7.2
    net_rev = ((((total_rev / unit_1st_pos) - 12) * 0.4) + (12 * 0.6)) * unit_1st_pos;
}
else
{
    net_rev = total_rev * 0.6;
}
net_return_per_sale = net_rev / unit_1st_pos;
// Split the net return per sale in proportion to each position's revenue
new_rev_1st_pos = (rev_1st_pos / total_rev) * net_return_per_sale;
new_rev_2nd_pos1 = (rev_2nd_pos1 / total_rev) * net_return_per_sale;
new_rev_2nd_pos2 = (rev_2nd_pos2 / total_rev) * net_return_per_sale;
double check = new_rev_1st_pos + new_rev_2nd_pos1 + new_rev_2nd_pos2; // check must equal net_return_per_sale
If there is still an issue, please leave a comment!
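For comparison, the same calculation can be written as a small R function. This is only a sketch: the function name apportion and its parameter names are hypothetical, and the threshold of 12 and the rates 0.6 and 0.4 are taken from the question.

```r
# Apportion the net return per sale across positions in proportion to
# each position's revenue. All names here are illustrative.
apportion <- function(revenues, units_1st_pos,
                      threshold = 12, rate_below = 0.6, rate_above = 0.4) {
  total_rev <- sum(revenues)
  per_sale <- total_rev / units_1st_pos

  if (per_sale >= threshold) {
    # rate_below on the first `threshold` per sale, rate_above on the rest.
    net_rev <- (threshold * rate_below +
                (per_sale - threshold) * rate_above) * units_1st_pos
  } else {
    net_rev <- total_rev * rate_below
  }

  net_return_per_sale <- net_rev / units_1st_pos
  shares <- revenues / total_rev * net_return_per_sale

  list(net_rev = net_rev,
       net_return_per_sale = net_return_per_sale,
       shares = shares)
}

# Figures from the question: revenues 24, 10 and 5 with 3 first-position units.
result <- apportion(c(first = 24, second1 = 10, second2 = 5), units_1st_pos = 3)
result$net_rev
round(result$shares, 2)
sum(result$shares)
```

With the question's figures, the shares sum back to the net return per sale, which is the double check the question performs by hand.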

Sample exactly four maintaining almost equal sample distances

I am trying to generate appointment times for yearly scheduled visits. The available days are days=1:365, and the first appointment should be chosen randomly: first=sample(days,1,replace=F)
Now, given the first appointment, I want to generate 3 more appointments within 1:365 so that there are exactly 4 appointments in the 1:365 space, spaced as equally as possible.
I have tried
point<-sort(c(first-1:5*364/4,first+1:5*364/4 ));point<-point[point>0 & point<365]
but it does not always give me 4 appointments. I eventually ran this many times and kept only the samples with 4 appointments, but I wanted to ask if there is a more elegant way to get exactly 4 points spaced as equally as possible.
I was thinking of equal spacing (around 91 days between appointments) in a year starting at the first appointment... Essentially one appointment per quarter of the year.
# Find how many days are in a quarter of the year
days = 1:365
quarter = floor(365/4)
first = sample(days, 1)
all = c(first, first + (1:3)*quarter)
# Wrap appointments that fall past day 365 into the start of the year
all[all > 365] = all[all > 365] - 365
all
sort(all)
Is this what you're looking for?
set.seed(1) # for reproducible example ONLY - you need to take this out.
first <- sample(1:365,1)
points <- c(first+(0:3)*(365-first)/4)
points
# [1] 97 164 231 298
Another way uses
points <- c(first+(0:3)*(365-first)/3)
This creates 4 points equally spaced on [first, 365], but the last point will always be 365.
Your code gives unexpected results because you use first-1:5*364/4. This creates points prior to first, some of which can be < 0. You then exclude those with point[point>0 ...], so the number of points that remain depends on first.
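The quarterly-spacing idea with wrap-around can also be written with modular arithmetic and checked against every possible first day. This is a sketch under the assumption that appointments may wrap past day 365 into the start of the year; the function name appointments and its parameters are hypothetical.

```r
# Return n appointment days in 1..year_length, starting at `first` and
# spaced floor(year_length / n) days apart, wrapping around the year end.
appointments <- function(first, n = 4, year_length = 365) {
  step <- floor(year_length / n)
  sort(((first - 1 + (0:(n - 1)) * step) %% year_length) + 1)
}

first <- sample(1:365, 1)
appointments(first)

# Check every possible first day: always 4 distinct days within 1..365.
ok <- all(sapply(1:365, function(f) {
  a <- appointments(f)
  length(unique(a)) == 4 && all(a >= 1 & a <= 365)
}))
ok
```

For example, a first appointment on day 300 gives days 26, 117, 208 and 300, with gaps of 91 or 92 days when the wrap across the year end is counted.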
