Here's my table:
sqlite> SELECT * from portafolio WHERE my_id=1;
id my_id stock name shares price date
---------- ---------- ---------- ------------- ---------- ---------- -------------------
1 1 NFLX Netflix, Inc. 2 133.26 2017/01/19 06:40AM
2 1 GM General Motor 1 37.47 2017/01/19 06:40AM
3 1 NFLX Netflix, Inc. -2 133.26 2017/01/19 06:41AM
4 1 NFLX Netflix, Inc. 4 133.26 2017/01/19 06:41AM
I'd like to make the price negative if shares is negative.
I know that I have to use CASE but don't know how to properly implement it, here's my attempt:
sqlite> SELECT CASE
...> WHEN shares<0
...> THEN price=price*-1
...> END, stock,shares,price FROM portafolio;
Output:
CASE
WHEN shares<0
THEN price=price*-1
END stock shares price
------------------------------------------ ---------- ---------- ----------
NFLX 2 133.26
GM 1 37.47
0 NFLX -2 133.26
NFLX 4 133.26
ST 1 41.2
HA 5 56.65
0 ST -1 41.2
0 HA -3 56.65
0 HA -2 56.65
GM 1 37.47
0 GM -1 37.47
Any help would be appreciated guys.
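Your CASE expression has no ELSE branch, so it returns NULL for every row where shares is not less than zero. Also, inside a SELECT, price=price*-1 is an equality comparison rather than an assignment, which is why the negative-share rows show 0. You don't need to assign anything; just return price * -1 like this: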
CASE WHEN shares < 0
THEN price * -1
ELSE price
END
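To see the difference between the two CASE expressions, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table is a cut-down copy of the sample rows from the question (the name and date columns are left out), and the variable names are only illustrative.

```python
import sqlite3

# In-memory database holding a reduced copy of the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE portafolio ("
    "id INTEGER PRIMARY KEY, my_id INTEGER, stock TEXT, shares INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO portafolio (my_id, stock, shares, price) VALUES (?, ?, ?, ?)",
    [(1, "NFLX", 2, 133.26), (1, "GM", 1, 37.47),
     (1, "NFLX", -2, 133.26), (1, "NFLX", 4, 133.26)],
)

# Original attempt: no ELSE, and "price = price * -1" is a comparison.
broken = conn.execute(
    "SELECT CASE WHEN shares < 0 THEN price = price * -1 END "
    "FROM portafolio ORDER BY id"
).fetchall()

# Corrected expression: return the negated price, or the price itself.
signed_rows = conn.execute(
    "SELECT stock, shares, "
    "CASE WHEN shares < 0 THEN price * -1 ELSE price END AS signed_price "
    "FROM portafolio ORDER BY id"
).fetchall()

print(broken)       # NULL (None) for positive shares, 0 for negative shares
for row in signed_rows:
    print(row)
```

The first query reproduces the blank and 0 values seen in the question; the second returns the signed price for every row.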
I need help with the following dataset:
ID Code SAStime
001 1 0
001 1 600
001 1 1200
... ... ...
001 1 84600
001 2 85200
001 2 85800
I would like to tell the program that if Code=1 for a SAStime between 0 and 85800, those rows should be deleted, so that I am left with something like this:
ID Code SAStime
001 2 85200
001 2 85800
I've tried with drop, keep and where functions but for some reason it's not working.
I think you are asking about basic SAS syntax.
As for this question, the answer is below:
data have;
input ID$ Code SAStime;
cards;
001 1 0
001 1 600
001 1 1200
001 1 84600
001 2 85200
001 2 85800
;
run;
data want;
set have;
if Code = 1 and 0 <= SAStime <= 85800 then delete;
run;
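For readers without SAS at hand, the same row filter can be sketched in Python. The tuple layout (ID, Code, SAStime) is an assumption that mirrors the `have` dataset above; it is not part of the original answer.

```python
# Rows mirror the "have" dataset: (ID, Code, SAStime).
have = [
    ("001", 1, 0),
    ("001", 1, 600),
    ("001", 1, 1200),
    ("001", 1, 84600),
    ("001", 2, 85200),
    ("001", 2, 85800),
]

# Keep a row unless Code is 1 and SAStime lies in [0, 85800],
# which is the condition the DATA step uses for "delete".
want = [
    (id_, code, sastime)
    for id_, code, sastime in have
    if not (code == 1 and 0 <= sastime <= 85800)
]

for row in want:
    print(row)
```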
So, I've got a table.
sqlite> .schema
CREATE TABLE data(date int primary key, temp text, humi text, co2 text, coarse int);
It's got some data, and using a WHERE condition on a TEXT column does what I'd expect:
sqlite> SELECT * FROM data WHERE temp > 26;
date temp humi co2 coarse
---------- ---------- ---------- ---------- ----------
1569962962 26.01 30.97 530.34 1
1569963029 26.05 30.91 528.57 0
1569963097 26.05 30.87 530.16 0
1569963164 26.09 30.83 530.37 1
1569963232 26.09 30.84 530.75 0
1569963300 26.13 30.77 532.51 0
It also does what I expect when I give it a condition none of the rows match:
sqlite> select * from data where temp > 99;
sqlite>
Unless I use 100, or 1000, etc. Then it ignores the WHERE and gives me every row:
sqlite> select * from data where temp > 100;
date temp humi co2 coarse
---------- ---------- ---------- ---------- ----------
1569967795 25.99 31.65 558.03 1
1569967863 26.01 31.60 558.78 0
1569967930 26.02 31.64 557.77 0
1569967998 26.01 31.65 556.68 1
1569968067 26.02 31.63 557.31 0
1569968134 26.04 31.64 560.01 0
1569968201 26.08 31.66 559.84 1
1569968268 26.05 31.66 563.95 0
1569968335 26.08 31.70 562.86 0
1569968403 26.09 31.69 563.85 1
1569968471 26.09 31.73 565.58 0
1569968539 26.11 31.69 566.04 0
1569968607 26.13 31.69 564.95 1
1569968674 26.13 31.62 565.51 0
1569968742 26.16 31.63 567.40 0
1569968810 26.16 31.60 568.38 1
[snip]
I, of course, discovered this by doing a DELETE operation with a WHERE clause to remove some bad data. It's okay, I've mostly stopped crying now. (The historical sensor data was not important) But why the behavior on multiples of 10? I assume it's doing something too-clever with flexible typing, but I don't see where.
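It's because you are doing a textual comparison: temp has TEXT column affinity, so the literal 100 is compared as the string '100'. Strings are compared character by character, so '25.99' is greater than '100' because the character 2 is greater than the character 1, and the remaining characters are never considered. By the same rule '26.01' is less than '99', which is why that query returned nothing.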
You need to force a numerical comparison and this can be done by CASTing e.g.
SELECT * FROM data WHERE CAST(temp AS REAL) > 99;
or
SELECT * FROM data WHERE temp > CAST(99 AS REAL);
You may wish to have a look at CAST expressions
If the column type of the temp column were REAL as per
CREATE TABLE data(date int primary key, temp REAL /*<<<<<<<<<< CHANGED */, humi text, co2 text, coarse int);
then the CAST would not be required.
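The behaviour can be reproduced with a small sketch using Python's built-in sqlite3 module. The two-row table below is a reduced stand-in for the real data (only the date and temp columns, with values taken from the question), so treat it as an illustration rather than the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# temp is declared TEXT, as in the question's schema.
conn.execute("CREATE TABLE data (date INT PRIMARY KEY, temp TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1569962962, "26.01"), (1569967795, "25.99")],
)

def count(where_clause):
    """Return how many rows match the given WHERE clause."""
    return conn.execute(
        "SELECT count(*) FROM data WHERE " + where_clause
    ).fetchone()[0]

# Textual comparison: the literal is compared as a string.
text_gt_99 = count("temp > 99")     # '2...' sorts before '99'
text_gt_100 = count("temp > 100")   # '2...' sorts after '100'

# Numeric comparison forced with CAST.
cast_gt_26 = count("CAST(temp AS REAL) > 26")
cast_gt_100 = count("CAST(temp AS REAL) > 100")

print(text_gt_99, text_gt_100, cast_gt_26, cast_gt_100)
```

With the cast in place, only the 26.01 row exceeds 26 and no row exceeds 100, whereas the uncast comparison against 100 matches every row.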
I have one dataset which includes all the points of students and other variables.
I further have a diagonal matrix which includes information on which student is a peer of another student.
Now I would like to use the second matrix (network) to calculate the mean-peer-points for each student. Everyone can have different (number of) peers.
To calculate the mean, I recalculated the simple 0,1 matrix into percentages, whereby the denominator is the sum of the number of peers one student has.
The second matrix then would look something like this:
ID1 ID2 ID3 ID4 ID5
ID1 0 0 0 0 1
ID2 0 0 0.5 0.5 0
ID3 0 0.5 0 0 0.5
ID4 0 0.5 0 0 0.5
ID5 0.33 0 0.33 0.33 0
And the points of each students is a simple variable in another dataset, and I would like to have the peers-average-points in as a second variable:
ID Points Peers
ID1 45 11
ID2 42 33.5
ID3 25 26.5
ID4 60 26.5
ID5 11 43.33
Are there any commands in Stata for that problem? I am currently looking into the Stata commands nwcommands, but I am unsure whether it can help. I could use solutions for Stata and R.
Without getting too creative, you can accomplish what you are trying to do with reshape, collapse and a couple of merges in Stata. Generally speaking, data in long format is easier to work with for this type of exercise.
Below is an example which produces the desired result.
/* Set-up data for example */
clear
input int(id points)
1 45
2 42
3 25
4 60
5 11
end
tempfile points
save `points'
clear
input int(StudentId id1 id2 id3 id4 id5)
1 0 0 0 0 1
2 0 0 1 1 0
3 0 1 0 0 1
4 0 1 0 0 1
5 1 0 1 1 0
end
/* End data set-up */
* Reshape peers data to long form
reshape long id, i(StudentId) j(PeerId)
drop if id == 0 // drop if student is not a peer of `StudentId`
* create id variable to use in merge
replace id = PeerId
* Merge to points data to get peer points
merge m:1 id using `points', nogen
* collapse data to the student level, sum peer points
collapse (sum) PeerPoints = points (count) CountPeers = PeerId, by(StudentId)
* merge back to points data to get student points
rename StudentId id
merge 1:1 id using `points', nogen
gen peers = PeerPoints / CountPeers
li id points peers
+------------------------+
| id points peers |
|------------------------|
1. | 1 45 11 |
2. | 2 42 42.5 |
3. | 3 25 26.5 |
4. | 4 60 26.5 |
5. | 5 11 43.33333 |
+------------------------+
In the above code, I reshape your peer data into long form and keep only the student-peer pairs. I then merge this data with the points data to get the points of each student's peers. From there, I collapse the data back to the student level, totaling peer points and counting peers in the process. At this point, you have the total points of each student's peers and the number of peers each student has. Finally, you merge back to the points data to get each student's own points and divide total peer points (PeerPoints) by the number of peers (CountPeers) to get the average peer points.
nwcommands is an outstanding package I have never used or studied, so I will just try the problem from first principles. This is all matrix algebra, but given a matrix and a variable, I would approach it like this in Stata.
clear
scalar third = 1/3
mat M = (0,0,0,0,1\0,0,0.5,0.5,0\0,0.5,0,0,0.5\0,0.5,0,0,0.5\third,0,third,third,0)
input ID Points Peers
1 45 11
2 42 33.5
3 25 26.5
4 60 26.5
5 11 43.33
end
gen Wanted = 0
quietly forval i = 1/5 {
forval j = 1/5 {
replace Wanted = Wanted + M[`i', `j'] * Points[`j'] in `i'
}
}
list
+--------------------------------+
| ID Points Peers Wanted |
|--------------------------------|
1. | 1 45 11 11 |
2. | 2 42 33.5 42.5 |
3. | 3 25 26.5 26.5 |
4. | 4 60 26.5 26.5 |
5. | 5 11 43.33 43.33334 |
+--------------------------------+
Small points: Using 0.33 for 1/3 doesn't give enough precision. You'll have similar problems for 1/6 and 1/7, for example.
Also, I get that the peers of 2 are 3 and 4 so their average is (25 + 60)/2 = 42.5, not 33.5.
EDIT: A similar approach starts with a data structure very like that imagined by @ander2ed:
clear
input int(id points id1 id2 id3 id4 id5)
1 45 0 0 0 0 1
2 42 0 0 1 1 0
3 25 0 1 0 0 1
4 60 0 1 0 0 1
5 11 1 0 1 1 0
end
gen wanted = 0
quietly forval i = 1/5 {
forval j = 1/5 {
replace wanted = wanted + id`j'[`i'] * points[`j'] in `i'
}
}
egen count = rowtotal(id1-id5)
replace wanted = wanted/count
list
+--------------------------------------------------------------+
| id points id1 id2 id3 id4 id5 wanted count |
|--------------------------------------------------------------|
1. | 1 45 0 0 0 0 1 11 1 |
2. | 2 42 0 0 1 1 0 42.5 2 |
3. | 3 25 0 1 0 0 1 26.5 2 |
4. | 4 60 0 1 0 0 1 26.5 2 |
5. | 5 11 1 0 1 1 0 43.33333 3 |
+--------------------------------------------------------------+
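Since the question also asked about other languages, the same calculation can be sketched in plain Python. This is not taken from either answer: it keeps the 0/1 peer matrix and divides by the row total, as in the Stata code just above, and the list names are only illustrative.

```python
# Points for students 1..5, in order.
points = [45, 42, 25, 60, 11]

# 0/1 peer matrix: row i marks the peers of student i + 1.
adjacency = [
    [0, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
]

peer_mean = []
for row in adjacency:
    n_peers = sum(row)
    total = sum(flag * p for flag, p in zip(row, points))
    # A student without peers has no defined mean, so store None.
    peer_mean.append(total / n_peers if n_peers else None)

for student, mean in enumerate(peer_mean, start=1):
    print(student, mean)
```

Dividing by the row total at the end avoids the rounding problem that comes from storing 0.33 in place of 1/3.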
I have 2 tables :
payments:
id amount type code
1 1200 0 111
2 100 1 111
3 200 0 111
4 50 0 112
5 500 2 112
6 300 3 113
bills:
id details code
-----------------------
1 bill-1 111
2 bill-2 112
3 bill-3 113
4 bill-4 114
I wanted to sum the amounts in payments table and join it with bills like below
result:
bills.code type0Sum type1Sum type2Sum type3Sum
-------------------------------------------------------------------------
111 1400 100 0 0
112 50 0 500 0
113 0 0 0 300
114 0 0 0 0
Sorry if this is a newbie question
[Edit]
I have used a similar query as below :
SELECT *
FROM bills,
(SELECT SUM(amount) AS type0Sum, code
FROM payments
WHERE type = 0
GROUP BY code)
AS sub1,
(SELECT SUM(amount) AS type1Sum, code
FROM payments
WHERE type = 1
GROUP BY code)
AS sub2
WHERE bills.code = sub1.code
AND bills.code = sub2.code
But I am only getting the rows that have a payment of every selected type, like:
bills.code type0Sum type1Sum type2Sum type3Sum
-------------------------------------------------------
111 1400 100
I've rewritten your final query to use explicit LEFT JOINs instead of the old comma-style joins you were using (read up on Cartesian joins). With a LEFT JOIN, a bill is kept even when a subquery has no matching row for it. Give this one a go and see if it works:
SELECT b.code
,sub1.type0Sum
,sub2.type1Sum
FROM bills b
LEFT JOIN (
SELECT SUM(amount) AS type0Sum
,code
FROM payments
WHERE type = 0
GROUP BY code
) AS sub1 ON b.code = sub1.code
LEFT JOIN (
SELECT SUM(amount) AS type1Sum
,code
FROM payments
WHERE type = 1
GROUP BY code
) AS sub2 ON b.code = sub2.code
There are other ways of doing this that are more efficient but I've kept to your query in order to help you learn.
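The answer does not say which alternatives it has in mind. One common candidate is conditional aggregation: a single LEFT JOIN with one SUM(CASE ...) per type. The sketch below is an assumption about what such a query could look like, run against the question's sample data with Python's built-in sqlite3 module; COALESCE turns the NULL sums into the zeros shown in the desired result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER, type INTEGER, code INTEGER);
CREATE TABLE bills (id INTEGER PRIMARY KEY, details TEXT, code INTEGER);
INSERT INTO payments VALUES
    (1, 1200, 0, 111), (2, 100, 1, 111), (3, 200, 0, 111),
    (4, 50, 0, 112), (5, 500, 2, 112), (6, 300, 3, 113);
INSERT INTO bills VALUES
    (1, 'bill-1', 111), (2, 'bill-2', 112), (3, 'bill-3', 113), (4, 'bill-4', 114);
""")

# One pass over payments: each SUM only counts amounts of its own type.
rows = conn.execute("""
    SELECT b.code,
           COALESCE(SUM(CASE WHEN p.type = 0 THEN p.amount END), 0) AS type0Sum,
           COALESCE(SUM(CASE WHEN p.type = 1 THEN p.amount END), 0) AS type1Sum,
           COALESCE(SUM(CASE WHEN p.type = 2 THEN p.amount END), 0) AS type2Sum,
           COALESCE(SUM(CASE WHEN p.type = 3 THEN p.amount END), 0) AS type3Sum
    FROM bills b
    LEFT JOIN payments p ON p.code = b.code
    GROUP BY b.code
    ORDER BY b.code
""").fetchall()

for row in rows:
    print(row)
```

Because bills is on the left side of the join, bill 114 still appears even though it has no payments.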
my table structure is :
table_system:
"ID" NUMBER NOT NULL ENABLE,
"COUNTRY" VARCHAR2(10 BYTE) NOT NULL ENABLE,
"COMPANYCODE" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"SYSTEM" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"NOTSTARTED" NUMBER,
"RUNNING" NUMBER,
"COMPLETED" NUMBER,
"ACTUALSTARTTIME" VARCHAR2(5 BYTE),
"ACTUALENDTIME" VARCHAR2(5 BYTE),
"SEQUENCE" NUMBER,
"PLANNEDSTARTTIME" VARCHAR2(5 BYTE),
"PLANNEDENDTIME" VARCHAR2(5 BYTE),
"ESTIMATEDENDTIME" VARCHAR2(5 BYTE),
CONSTRAINT "SYSTEMRUNTIME_PK" PRIMARY KEY ("ID", "COUNTRY", "COMPANYCODE", "SYSTEM") USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "SYSTEM" ENABLE
I need a query that returns the following output:
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8 … SYSTEM N
-------------------------------------------------- --------------------------- ------------------- ----------------------- ------------------------------ ------------------------------ -------------------- -------------- -------------- -------------- --------------
where systems are sorted as per the "SEQUENCE" attribute.
I have tried this query :
select distinct companycode, sequence, system,notstarted,running,completed
from table_system
where id = (select max(id) from table_system)
order by companycode, sequence
this fetches me the following
COMPANYCODE SEQUENCE SYSTEM NOTSTARTED RUNNING COMPLETED
-------------------------------------------------- ---------------------- -------------------------------------------------- ---------------------- ---------------------- ----------------------
1001 Helsinki Branch 1 GAP 2 / Datastage GL 0 0 3
1001 Helsinki Branch 2 SAP GL 0 0 2
1001 Helsinki Branch 3 SAP BW 0 0 2
1002 Copenhagen Branch 1 GAP 2 / Datastage GL 0 0 3
1002 Copenhagen Branch 2 SAP GL 0 0 2
1002 Copenhagen Branch 3 SAP BW 0 0 2
1003 Oslo Branch 1 GAP 2 / Datastage GL 0 0 3
1003 Oslo Branch 2 SAP GL 0 0 2
1003 Oslo Branch 3 SAP BW 0 0 2
1004 (publ) (EUR) 1 EKO 0 0 13
1004 (publ) (EUR) 2 HA Core 0 0 6
1004 (publ) (EUR) 3 HA Post Processor 0 0 5
1004 (publ) (EUR) 4 Datastage GL 3 0 10
1004 (publ) (EUR) 5 Datastage Recon 1 0 3
1004 (publ) (EUR) 11 SAP GL 0 0 4
1004 (publ) (EUR) 21 SAP BW 0 0 4
but I want the output to be :
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8 … SYSTEM N
-------------------------------------------------- --------------------------- ------------------- ----------------------- ------------------------------ ------------------------------ -------------------- -------------- -------------- -------------- --------------
1001 Helsinki Branch GAP 2 / Datastage GL SAP GL SAP BW
1002 Copenhagen Branch GAP 2 / Datastage GL SAP GL SAP BW
1003 Oslo Branch GAP 2 / Datastage GL SAP GL SAP BW
1004 (publ) (EUR) EKO HA Core HA Post Processor Datastage GL Datastage Recon SAP GL SAP BW
Any hint for the above will be highly appreciated.
Thank You
vinayak
Try that:
select companycode, COLLECT(system) as systems
from table_system
where id = (select max(id) from table_system)
group by companycode
order by companycode
You can use a pivot operation for this, but it cannot handle an unknown number of systems, because the number of selected columns must be known at parse time:
select * from
(
select companycode, system,
row_number() over (partition by id, country, companycode
order by sequence) as rn
from table_system
where id = (select max(id) from table_system)
)
pivot (max(system) for rn in (1 as system1, 2 as system2, 3 as system3,
4 as system4, 5 as system5, 6 as system6, 7 as system7, 8 as system8))
order by companycode;
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------------------------------------------------
1001 Helsinki Branch GAP 2 / Datastage GL SAP GL SAP BW
1002 Copenhagen Branch GAP 2 / Datastage GL SAP GL SAP BW
1003 Oslo Branch GAP 2 / Datastage GL SAP GL SAP BW
1004 (publ) (EUR) EKO HA Core HA Post Processor Datastage GL Datastage Recon SAP GL SAP BW
So you'd need to establish the maximum number of systems you'll ever have present, and add clauses to the pivot (9 as system9, ...) to accommodate them all. The row_number() call translates the sequence numbers into contiguous numbers, so you don't get a big gap between the 5th and 6th systems for company 1004. Pivoting on sequence directly would also force the pivot to cover the maximum possible sequence number rather than the maximum count of systems.
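The effect of numbering rows contiguously before pivoting can be illustrated outside the database. The following Python sketch is not Oracle code: it uses a few of the question's sample rows, ranks each company's systems by sequence, and pads to a fixed width of 8 columns, which stands in for the fixed column list in the pivot clause.

```python
from collections import defaultdict

# (companycode, sequence, system) rows taken from the question's sample output.
rows = [
    ("1001 Helsinki Branch", 1, "GAP 2 / Datastage GL"),
    ("1001 Helsinki Branch", 2, "SAP GL"),
    ("1001 Helsinki Branch", 3, "SAP BW"),
    ("1004 (publ) (EUR)", 1, "EKO"),
    ("1004 (publ) (EUR)", 2, "HA Core"),
    ("1004 (publ) (EUR)", 3, "HA Post Processor"),
    ("1004 (publ) (EUR)", 4, "Datastage GL"),
    ("1004 (publ) (EUR)", 5, "Datastage Recon"),
    ("1004 (publ) (EUR)", 11, "SAP GL"),
    ("1004 (publ) (EUR)", 21, "SAP BW"),
]

WIDTH = 8  # number of SYSTEMn columns, fixed in advance like the pivot list

by_company = defaultdict(list)
for company, sequence, system in rows:
    by_company[company].append((sequence, system))

pivoted = {}
for company, entries in by_company.items():
    # Sorting by sequence and taking list positions plays the role of
    # row_number(): sequences 11 and 21 become positions 6 and 7.
    ordered = [system for _, system in sorted(entries)]
    pivoted[company] = (ordered + [None] * WIDTH)[:WIDTH]

for company in sorted(pivoted):
    print(company, pivoted[company])
```

Any company with more than WIDTH systems would be truncated here, which corresponds to the need to extend the pivot's column list in advance.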