Query formatting - asp.net

I have a table with the rows below:
Name Month Salary Expense
John Jan 1000 50
John Feb 5000 2000
Jack Jan 3000 100
I want to get the output in the format below. How can I achieve this?
Name JAN FEB
John 1000 50 5000 2000
Jack 3000 100 0 0

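This SQL Server query would work:
select name,
isnull(max(case when month='jan' then salary end), 0) as Salary_jan,
isnull(max(case when month='jan' then expense end), 0) as Expense_jan,
isnull(max(case when month='feb' then salary end), 0) as Salary_feb
-- and so on for the remaining months and expense columns
from your_table -- the question does not name the table; substitute your own
group by name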
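To try the conditional-aggregation pivot without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name pay is an assumption (the question does not name its table), and COALESCE stands in for ISNULL because SQLite has no two-argument ISNULL function.

```python
import sqlite3

# Hypothetical table name "pay"; the question does not say what the table is called.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pay (name TEXT, month TEXT, salary INTEGER, expense INTEGER)")
conn.executemany(
    "INSERT INTO pay VALUES (?, ?, ?, ?)",
    [("John", "Jan", 1000, 50), ("John", "Feb", 5000, 2000), ("Jack", "Jan", 3000, 100)],
)

# One MAX(CASE ...) per output column; COALESCE turns a missing month into 0.
pivot_sql = """
SELECT name,
       COALESCE(MAX(CASE WHEN month = 'Jan' THEN salary  END), 0) AS salary_jan,
       COALESCE(MAX(CASE WHEN month = 'Jan' THEN expense END), 0) AS expense_jan,
       COALESCE(MAX(CASE WHEN month = 'Feb' THEN salary  END), 0) AS salary_feb,
       COALESCE(MAX(CASE WHEN month = 'Feb' THEN expense END), 0) AS expense_feb
FROM pay
GROUP BY name
ORDER BY name DESC
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Jack has no Feb row, so both of his Feb columns come back as 0, as in the desired output.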

Related

How to number repeated values in a column in R?

I have a big dataset where some names are repeated, like below.
Name Year Value
AH 2013 1800
AH 2014 2400
AH 2015 2300
BC 2013 1900
BC 2014 1600
KP 2013 3600
DN 2013 2800
I'd like to know how to create a column for numbering repeated names sequentially.
Name Year Value Number
AH 2013 1800 1
AH 2014 2400 2
AH 2015 2300 3
BC 2013 1900 1
BC 2014 1600 2
KP 2013 3600 1
DN 2013 2800 1
I found a previous question with the following code:
library(data.table)
library(dplyr)
weighted_df %>%
mutate(Number = rowid(Name))
but I keep getting "Error in is.data.frame(.data) : object 'weighted_df' not found". I'm quite new to R, so I'm unsure what else I can try, or whether I'm using the wrong libraries for the function.
Any help is greatly appreciated! Thanks y'all!
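A note on the error: it only says that no object called weighted_df exists in the session. That name presumably belongs to the data frame in the other question, so it would need to be replaced with the name of your own data frame. The numbering itself is just a running count per name. Below is a language-neutral sketch of that logic in plain Python; the list names and the variable names are made up for illustration and are not part of any R package.

```python
from collections import defaultdict

# Names in the order they appear in the example table.
names = ["AH", "AH", "AH", "BC", "BC", "KP", "DN"]

seen = defaultdict(int)   # how many times each name has been seen so far
numbers = []
for name in names:
    seen[name] += 1       # first occurrence gets 1, second gets 2, ...
    numbers.append(seen[name])

print(numbers)
```

The resulting list lines up with the Number column in the desired table.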

getting sum of records based on date range in sqlite

This is the table I created:
CREATE TABLE "PaymentOut" (
"poID" INTEGER NOT NULL,
"poSNO" INTEGER,
"poMType" TEXT,
"poType" TEXT,
"poSubType" TEXT,
"poName" TEXT,
"poDesc" TEXT,
"poAmount" INTEGER,
"poDate" TEXT,
PRIMARY KEY("poID" AUTOINCREMENT)
);
This is the data stored in the PaymentOut table.
What I want is to sum poAmount only if it falls within a specified date range.
This is the query I am executing:
SELECT p.poName,sum(CASE WHEN p.poAmount BETWEEN '12/1/2021' AND '12/30/2021' THEN p.poAmount ELSE 0 END) as Advance FROM PaymentOut p WHERE p.poSubType = 'emp' GROUP by p.poName
Output of the above query:
poName Advance
Ikram 0
Rashid 0
Saeed 0
Desired output should be:
poName Advance
Ikram 100
Rashid 1000
Saeed 3000
I can get this output by running the query without specifying a date:
SELECT p.poName,sum(p.poAmount) as Advance FROM PaymentOut p WHERE p.poSubType = 'emp' GROUP by p.poName
poName Advance
Ikram 100
Rashid 1000
Saeed 3000
But my requirement is to get this result for a given date range.
I fixed the issue thanks to @forpas, and also by creating a view and putting the date range in the WHERE condition when querying that view.
What I was trying before: the table data (note how the dates are stored) looked something like this:
22 2 cash expenses emp Saeed 1000 12/5/2021
23 3 cash expenses emp Ikram 100 12/5/2021
24 4 cash expenses emp Rashid 1000 12/7/2021
25 5 cash expenses emp Saeed 1000 12/8/2021
26 6 cash expenses emp Rashid 1000 12/5/2021
Query:
SELECT
p.poName,
SUM(CASE WHEN p.poAmount BETWEEN '12/1/2021' AND '12/30/2021' THEN p.poAmount ELSE 0 END) AS Advance
FROM
PaymentOut p
WHERE
p.poSubType = 'emp'
GROUP BY
p.poName
Output of this query:
poName Advance
---------------
Ikram 0
Rashid 0
Saeed 0
First I changed the date format to the one @forpas mentioned. This is how the data in my table looks now:
22 2 cash expenses emp Saeed 1000 2021-12-09
23 3 cash expenses emp Ikram 100 2021-12-09
24 4 cash expenses emp Rashid 1000 2021-12-20
25 5 cash expenses emp Saeed 1000 2021-12-09
26 6 cash expenses emp Rashid 1000 2021-12-09
27 7 cash expenses emp Saeed 1000 2021-12-09
28 8 cash expenses emp Abdul 1000 2021-12-09
29 9 cash expenses emp Shahid 1000 2021-12-09
Then I created a view and selected all the columns I require:
CREATE VIEW vempAdv
AS
SELECT p.poName, p.poDate, p.poAmount
FROM PaymentOut p
WHERE p.poSubType = 'emp'
Then I ran a query against that view, i.e.
SELECT
poName,
SUM(poAmount) AS adv
FROM
vempAdv
WHERE
poDate BETWEEN '2021-12-01' AND '2021-12-20'
GROUP BY
poName
Output of that is:
Abdul 1000
Ikram 100
Rashid 2000
Saeed 3000
Shahid 1000
Note: the data has changed from the question I asked.
@forpas, thank you for taking the time to help me out.
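Two things are worth spelling out about the fix. The original CASE expression tested p.poAmount against the date strings rather than p.poDate, and dates stored as M/D/YYYY text do not sort chronologically, whereas YYYY-MM-DD text does. Here is a minimal sketch with Python's sqlite3 module that keeps the conditional SUM but applies it to poDate. The table is a cut-down version of PaymentOut with only the columns the query needs, which is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, illustrative version of PaymentOut (only the columns used below).
conn.execute(
    "CREATE TABLE PaymentOut (poName TEXT, poSubType TEXT, poAmount INTEGER, poDate TEXT)"
)
conn.executemany(
    "INSERT INTO PaymentOut VALUES (?, ?, ?, ?)",
    [
        ("Saeed", "emp", 1000, "2021-12-09"),
        ("Ikram", "emp", 100, "2021-12-09"),
        ("Rashid", "emp", 1000, "2021-12-20"),
        ("Saeed", "emp", 1000, "2021-12-09"),
        ("Rashid", "emp", 1000, "2021-12-09"),
        ("Saeed", "emp", 1000, "2021-12-09"),
        ("Abdul", "emp", 1000, "2021-12-09"),
        ("Shahid", "emp", 1000, "2021-12-09"),
    ],
)

# ISO-8601 date strings compare correctly as text, so BETWEEN works on poDate.
advances = conn.execute(
    """
    SELECT poName,
           SUM(CASE WHEN poDate BETWEEN ? AND ? THEN poAmount ELSE 0 END) AS Advance
    FROM PaymentOut
    WHERE poSubType = 'emp'
    GROUP BY poName
    ORDER BY poName
    """,
    ("2021-12-01", "2021-12-20"),
).fetchall()
for name, advance in advances:
    print(name, advance)
```

Unlike the WHERE-based query on the view, this form keeps a row (with 0) for any name that has no payments inside the range.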

sqlite multiple query conditions

I've searched but can't find the right answer, and I'm going round in circles.
I have
CREATE TABLE History (yr Int, output Int, cat Text);
yr output cat
---------- ---------- ----------
2015 10 a
2016 20 a
2017 30 a
2018 50 a
2019 70 a
2015 100 b
2016 200 b
2017 300 b
2018 500 b
2019 700 b
2015 1000 c
2016 2000 c
2017 3000 c
2018 5000 c
2019 7000 c
2015 10000 d
2016 20000 d
2017 30000 d
2018 50000 d
2019 70000 d
I've created two views
CREATE VIEW Core AS select * from History where cat = "c" or cat = "d";
CREATE VIEW Plus AS select * from History where cat = "a" or cat = "b";
My query is
select distinct yr, sum(output), (select sum(output) from core group by yr) as _core, (select sum(output) from plus group by yr) as _plus from history group by yr;
yr sum(output) _core _plus
---------- ----------- ---------- ----------
2015 11110 11000 110
2016 22220 11000 110
2017 33330 11000 110
2018 55550 11000 110
2019 77770 11000 110
Each of the individual queries works, but the _core and _plus columns are wrong when it's all put together. How should I approach this, please?
You may generate your expected output without a view, using a single query with conditional aggregation:
SELECT
yr,
SUM(output) AS sum_output,
SUM(CASE WHEN cat IN ('c', 'd') THEN output ELSE 0 END) AS _core,
SUM(CASE WHEN cat IN ('a', 'b') THEN output ELSE 0 END) AS _plus
FROM History
GROUP BY
yr;
If you really wanted to make your current approach work, one way would be to just join the two views by year. But that would leave open the possibility that each view might not have every year present.
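For context, the scalar subqueries in the question are not tied to the outer yr, so each one contributes only its first row (11000 and 110) to every output row. Here is a rough sketch of the join-by-year alternative, run through Python's sqlite3 module. The derived-table aliases t, c and p and the column names total and s are made up for the example. LEFT JOIN plus COALESCE is one way to cope with a year that is missing from a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE History (yr INT, output INT, cat TEXT)")

# Rebuild the sample data: each category is the 'a' series times a power of ten.
base = [(2015, 10), (2016, 20), (2017, 30), (2018, 50), (2019, 70)]
scale = {"a": 1, "b": 10, "c": 100, "d": 1000}
conn.executemany(
    "INSERT INTO History VALUES (?, ?, ?)",
    [(yr, out * factor, cat) for cat, factor in scale.items() for yr, out in base],
)
conn.execute("CREATE VIEW Core AS SELECT * FROM History WHERE cat IN ('c', 'd')")
conn.execute("CREATE VIEW Plus AS SELECT * FROM History WHERE cat IN ('a', 'b')")

# Aggregate each source per year first, then join the per-year totals.
joined = conn.execute(
    """
    SELECT t.yr, t.total, COALESCE(c.s, 0) AS _core, COALESCE(p.s, 0) AS _plus
    FROM (SELECT yr, SUM(output) AS total FROM History GROUP BY yr) AS t
    LEFT JOIN (SELECT yr, SUM(output) AS s FROM Core GROUP BY yr) AS c ON c.yr = t.yr
    LEFT JOIN (SELECT yr, SUM(output) AS s FROM Plus GROUP BY yr) AS p ON p.yr = t.yr
    ORDER BY t.yr
    """
).fetchall()
for row in joined:
    print(row)
```

The single-query conditional aggregation above is still the simpler option; the join is mainly useful if the views have to stay.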

Add row with group sum in new column at the end of group category

I have been searching for this information since yesterday, but so far I have not found a nice solution to my problem.
I have the following dataframe:
CODE CONCEPT P. NR. NAME DEPTO. PRICE
1 Lunch 11 John SALES 160
1 Lunch 11 John SALES 120
1 Lunch 11 John SALES 10
1 Lunch 13 Frank IT 200
2 Internet 13 Frank IT 120
and I want to add a column with the sum of rows by group, for instance, the total amount of concept: Lunch, code: 1 by name in order to get an output like this:
CODE CONCEPT P. NR. NAME DEPTO. PRICE TOTAL
1 Lunch 11 John SALES 160 NA
1 Lunch 11 John SALES 120 NA
1 Lunch 11 John SALES 10 290
1 Lunch 13 Frank IT 200 200
2 Internet 13 Frank IT 120 120
So far, I tried with:
aggregate(PRICE~NAME+CODE, data = df, FUN = sum)
But this retrieves just the total of the concepts like this:
NAME CODE TOTAL
John 1 290
Frank 1 200
Frank 2 120
And not the table with the rest of the data as I would like to have it.
I also tried adding an extra column with NA but somehow I cannot paste the total in a specific row position.
Any suggestions? I would like to have something I can do in base R.
Thanks!!
In base R you can use ave to add a new column. We insert the group sum only in the last row of each group.
df$TOTAL <- with(df, ave(PRICE, CODE, CONCEPT, PNR, NAME, FUN = function(x)
ifelse(seq_along(x) == length(x), sum(x), NA)))
df
# CODE CONCEPT PNR NAME DEPTO. PRICE TOTAL
#1 1 Lunch 11 John SALES 160 NA
#2 1 Lunch 11 John SALES 120 NA
#3 1 Lunch 11 John SALES 10 290
#4 1 Lunch 13 Frank IT 200 200
#5 2 Internet 13 Frank IT 120 120
Similar logic using dplyr:
library(dplyr)
df %>%
group_by(CODE, CONCEPT, PNR, NAME) %>%
mutate(TOTAL = ifelse(row_number() == n(), sum(PRICE), NA))
For another base R option, you may try merging the original data frame with the aggregated totals (note that this repeats the total on every row of the group):
df2 <- aggregate(PRICE~NAME+CODE, data = df, FUN = sum)
out <- merge(df[ , !(names(df) %in% c("PRICE"))], df2, by=c("NAME", "CODE"))
out[with(out, order(CODE, NAME)), ]
NAME CODE CONCEPT PNR DEPT PRICE
1 Frank 1 Lunch 13 IT 200
3 John 1 Lunch 11 SALES 290
4 John 1 Lunch 11 SALES 290
5 John 1 Lunch 11 SALES 290
2 Frank 2 Internet 13 IT 120
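The "total only on the last row of each group" logic used by the ave and dplyr answers can be sketched in plain Python as follows. This is only an illustration of the idea, not a replacement for the R code. The dictionary layout and the helper name group_key are assumptions.

```python
rows = [
    {"CODE": 1, "CONCEPT": "Lunch", "PNR": 11, "NAME": "John", "PRICE": 160},
    {"CODE": 1, "CONCEPT": "Lunch", "PNR": 11, "NAME": "John", "PRICE": 120},
    {"CODE": 1, "CONCEPT": "Lunch", "PNR": 11, "NAME": "John", "PRICE": 10},
    {"CODE": 1, "CONCEPT": "Lunch", "PNR": 13, "NAME": "Frank", "PRICE": 200},
    {"CODE": 2, "CONCEPT": "Internet", "PNR": 13, "NAME": "Frank", "PRICE": 120},
]

def group_key(row):
    # Hypothetical helper: the columns that define a group.
    return (row["CODE"], row["CONCEPT"], row["PNR"], row["NAME"])

group_sum = {}    # running PRICE total per group
last_index = {}   # position of the last row seen for each group
for i, row in enumerate(rows):
    key = group_key(row)
    group_sum[key] = group_sum.get(key, 0) + row["PRICE"]
    last_index[key] = i

# Write the total on the last row of each group and None (NA) elsewhere.
for i, row in enumerate(rows):
    key = group_key(row)
    row["TOTAL"] = group_sum[key] if last_index[key] == i else None

totals = [row["TOTAL"] for row in rows]
print(totals)
```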

Extrapolate missing data for each group by average percentage of change

I have a data frame containing average income by zip code, for the years 2010-2014. I want data for the years 2015-2017, so I'm looking for a way to extrapolate this based on the yearly average change of each zip code group for the years available.
For example:
year zip income
2010 1111 5000
2011 1111 5500
2012 1111 6000
2013 1111 6500
2014 1111 7000
2010 2222 5000
2011 2222 6000
2012 2222 7000
2013 2222 8000
2014 2222 9000
Should (roughly) have:
year zip income
2010 1111 5000
2011 1111 5500
2012 1111 6000
2013 1111 6500
2014 1111 7000
2015 1111 7614
2016 1111 8282
2017 1111 9009
2010 2222 5000
2011 2222 6000
2012 2222 7000
2013 2222 8000
2014 2222 9000
2015 2222 10424
2016 2222 12074
2017 2222 13986
Based on an average growth of 8.78% for zip code 1111 and 15.83% for zip code 2222.
Here's a very quick, messy data.table idea:
library(data.table)
library(zoo) # provides na.locf, used below
#Create data
last_year <- 2014
dt <- data.table(year=rep(2010:last_year,2),
zip=c(rep(1111,5),rep(2222,5)),
income=c(seq(5000,7000,500),seq(5000,9000,1000)))
#Future data
dt_fut <- data.table(year=rep((last_year+1):2017,2),
zip=c(rep(1111,3),rep(2222,3)),
income=rep(NA_integer_,6))
#calculate mean percentage change per year
dt[,avg_growth:=mean(diff(log(income))),by=zip]
#bind old with future data
dt <- rbindlist(list(dt,dt_fut),fill=TRUE);setorder(dt,zip,year)
#carry last value forward replace NA
dt[,avg_growth:=na.locf(avg_growth),by=zip][,income:=na.locf(income),by=zip]
#calculate
# from last_year+1 (2015) onward, replace income with the last observed
# income times the cumulative product of (1 + average growth)
# note: avg_growth is a mean log change, so 1+avg_growth is an approximation;
# exp(avg_growth) is the exact growth factor
dt[year>=last_year+1,income:=income*cumprod(1+avg_growth),by=zip][]
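For comparison, here is a small Python sketch that uses the arithmetic mean of the year-over-year percentage changes, which is what the question's 8.78% figure for zip 1111 corresponds to. It is not a port of the data.table code above, which averages log changes. The function names mean_growth and extrapolate are made up for this example.

```python
# Observed incomes for 2010-2014, keyed by zip code (data from the question).
incomes = {
    1111: [5000, 5500, 6000, 6500, 7000],
    2222: [5000, 6000, 7000, 8000, 9000],
}

def mean_growth(values):
    """Arithmetic mean of the year-over-year relative changes."""
    changes = [(later - earlier) / earlier for earlier, later in zip(values, values[1:])]
    return sum(changes) / len(changes)

def extrapolate(values, n_years):
    """Compound the last observed value forward at the mean growth rate."""
    rate = mean_growth(values)
    projected = []
    current = values[-1]
    for _ in range(n_years):
        current *= 1 + rate
        projected.append(current)
    return projected

# Project 2015-2017 for every zip code.
projections = {zip_code: extrapolate(vals, 3) for zip_code, vals in incomes.items()}
for zip_code, values in projections.items():
    print(zip_code, [round(v) for v in values])
```

The projections are kept unrounded between years, so they can differ by a unit or two from figures that were rounded at each step.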
