How to create a column in PL/SQL that is a listagg of other column names that have null values for each row - plsql

Like the title says, I am trying to create a column in PL/SQL that is a list aggregation of the names of the other columns that are null in each row, without writing a CASE statement with dozens of WHEN clauses.
I am working with date/time data with lots of nulls and would like to create a column that shows, for a report, which date columns are missing data.
So if dates were missing from columns A, B, and C for a specific row, then the new column's value for that row would be 'A,B,C'.
How would I do this without writing case statements for each combination of columns that could be missing dates? I've started writing the case statements for each combo for now but don't want this to be my permanent solution.
I appreciate any insight.
Thanks!
Here is an example of my data:
A           B     C     D           E
NULL        NULL  NULL  2010-05-10  2011-01-05
2011-05-13  NULL  NULL  2010-11-10  2011-11-30
2009-07-21  NULL  NULL  NULL        2011-02-02
I am trying to create column F.
A           B     C     D           E           F
NULL        NULL  NULL  2010-05-10  2011-01-05  A,B,C
2011-05-13  NULL  NULL  2010-11-10  2011-11-30  B,C
2009-07-21  NULL  NULL  NULL        2011-02-02  B,C,D
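One way to avoid a CASE branch per combination is to emit one small term per column ('A,' when A is null, otherwise nothing), concatenate the terms, and trim the trailing comma. The sketch below is only an illustration of that idea: it runs against an in-memory SQLite database through Python's sqlite3 module rather than Oracle, and the table name t and the helper missing_columns_expr are made up for the example. In Oracle the per-column term could presumably be written with NVL2(A, NULL, 'A,') instead of CASE, but that variant is not tested here.

```python
import sqlite3

COLUMNS = ["A", "B", "C", "D", "E"]


def missing_columns_expr(columns):
    """Build one SQL expression listing the names of the NULL columns.

    Each column contributes 'NAME,' when it is NULL and '' otherwise;
    the trailing comma is trimmed at the end.
    """
    parts = [f"CASE WHEN {c} IS NULL THEN '{c},' ELSE '' END" for c in columns]
    return "RTRIM(" + " || ".join(parts) + ", ',')"


conn = sqlite3.connect(":memory:")
# Hypothetical table name; the dates are stored as text for simplicity.
conn.execute("CREATE TABLE t (A TEXT, B TEXT, C TEXT, D TEXT, E TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (None, None, None, "2010-05-10", "2011-01-05"),
        ("2011-05-13", None, None, "2010-11-10", "2011-11-30"),
        ("2009-07-21", None, None, None, "2011-02-02"),
    ],
)

query = f"SELECT {missing_columns_expr(COLUMNS)} AS F FROM t ORDER BY rowid"
missing = [row[0] for row in conn.execute(query)]
print(missing)  # ['A,B,C', 'B,C', 'B,C,D']
```

The expression grows linearly with the number of columns (one term each) instead of needing one branch per combination of missing columns.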

Related

Retrieve correspondent rows when startsWith == TRUE as numeric values

I am trying to write a script for a complete automation of my data.
Here, I am trying to implement a loop in order to search for all the values which start with "Blank" in the name column of my data.frame.
How can I print all the corresponding rows as a vector?
For example, the name column has the value Blank C at row 5. I want a vector with the values from that same row (5) in the name column and in data columns 3:6. Currently the output is NA instead of the actual values.
for (val in ms.data$Name) {
  if (startsWith(val, "Blank")) {
    print(list(ms.data[val, 3:6]))
  }
}
Example data
Blank 1 05-Apr-17 7:04 PM 5.771899218 4.922906441 219.0199184 7.779938257
Blank 2 05-Apr-17 7:15 PM 4.913695034 4.071889653 2.161167065 2.567102283
Thank you

SQLite, how to apply function to each cell in a column and then perform IN operation on transformed values?

I'm trying to set up a query that applies a function to the cell values of a column before performing an IN check. Put differently, I want to match rows between two tables where the values in one table's column are substrings of the values in the other table's column. Both columns hold strings.
I need something like,
'A' IN ('A', 'B', 'C', 'D') from 'A' IN ('A|B', 'C|D').
What it comes down to, is the ability to say whether A from Table 1 is in Table 2.
Table 1   Table 2
-------   -------
A         A|B
B         C|D
C
D
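One possible approach, sketched here under assumptions, is to skip splitting the pipe-separated values altogether: wrap both strings in the delimiter and test for containment with SQLite's instr(). The table names table1 and table2, the column name val, and the extra row 'E' (added only to show a non-match) are hypothetical; the example runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one text column per table.
conn.execute("CREATE TABLE table1 (val TEXT)")
conn.execute("CREATE TABLE table2 (val TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("A",), ("B",), ("C",), ("D",), ("E",)],  # 'E' is an added non-matching row
)
conn.executemany("INSERT INTO table2 VALUES (?)", [("A|B",), ("C|D",)])

# Wrapping both sides in '|' makes '|A|' match inside '|A|B|' while
# preventing partial matches such as 'A' inside 'AA|B'.
query = """
SELECT t1.val
FROM table1 AS t1
WHERE EXISTS (
    SELECT 1
    FROM table2 AS t2
    WHERE instr('|' || t2.val || '|', '|' || t1.val || '|') > 0
)
ORDER BY t1.val
"""
matched = [row[0] for row in conn.execute(query)]
print(matched)  # ['A', 'B', 'C', 'D']
```

instr() is used instead of LIKE so that the comparison stays case-sensitive and characters such as % or _ in the data are not treated as wildcards.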

R: if a value is less or is na update another data.frame

I have two data.frames A and B.
A contains negative, absolute and NA values.
B contains only positive and NA values.
The dimensions of the data frames are the same.
data.frame A looks like this:
ENSMUSG00000000001.4/Gnai3 0.1943315 0.3021675 NA NA
ENSMUSG00000000003.9/Pbsn -1.4843914 -1.2608270 -0.2587953 -0.46167430
ENSMUSG00000000028.8/Cdc45 -0.2388901 -0.1106236 0.9046436 0.08968331
ENSMUSG00000000037.9/Scml 0.3242902 0.5385371 0.2311202 0.51110287
ENSMUSG00000000049.5/Apoh -1.7606033 -1.8159545 -0.2087083 -1.09614630
ENSMUSG00000000056.7/Narf NA NA -0.3747798 -0.55547798
I need to check if a value is NA or negative in this table then I need to update data.frame B on the same indices to the value 0.999.
For example:
The first record of A has two NA values, at indexes [1,4] and [1,5], so I will set B[1,4]=0.999 and B[1,5]=0.999.
I could do this in the nested loops for columns and rows but it would take too much time. Is there a faster way?
You can pass a Boolean mask as an index if it's the same size:
B[is.na(A) | A < 0] <- 0.999
I would use ifelse to do this, since the dataframes have the same dimensions.
A<-matrix(data=1:15,nrow=5) # create matrices (works with dataframe as well)
B<-matrix(data=16:30,nrow=5)
B[1,2]<-NA # introduce some NA and negative values
B[5,3]<-(-1)
ifelse(is.na(B) | B<=0,A,B) # new matrix with "updated" values
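To spell out what the Boolean-mask update does element by element, here is a plain-Python sketch of the same rule: wherever A is missing or negative, the matching cell of B becomes 0.999. It is an illustration only, not a replacement for the vectorized R one-liner. None stands in for NA, the two rows of a are taken from the question, and the values in b are hypothetical because the question does not show B.

```python
FILL = 0.999


def mask_update(a, b, fill=FILL):
    """Return a copy of b with `fill` wherever a is None (NA) or negative."""
    return [
        [fill if (x is None or x < 0) else y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# First two rows of A from the question (numeric columns only).
a = [
    [0.1943315, 0.3021675, None, None],
    [-1.4843914, -1.2608270, -0.2587953, -0.46167430],
]
# Hypothetical B with the same shape.
b = [
    [0.5, 0.5, 0.5, 0.5],
    [0.2, None, 0.2, 0.2],
]

updated = mask_update(a, b)
print(updated)
# [[0.5, 0.5, 0.999, 0.999], [0.999, 0.999, 0.999, 0.999]]
```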

Problems with using subset in r

I need to subset my data frame, but I do not know what condition to use.
df2<-subset(df, condition )
A part of the dataframe, `df`:
state value
a 1
b 2
c 3
a 1
b 4
c 5
I compute the sum of the value column for each state using: table(df$state)
I need to create a data frame that shows only the rows for states where the sum of the value column is greater than a given value x.
If x is 3, the new data frame should contain only the rows whose "state" column equals b or c.
What should I replace "condition" with? How can I use table(df$state) in the condition?
It is not clear what you are trying to do.
table(df$state) counts the occurrences of each state in your data, not the sum of the variable "value" for each "state". You should instead use something like this:
vv <- tapply(dat$value,dat$state,sum)
vv
a b c
2 6 8
Now you can use the result within subset to keep the rows whose state has a sum of the value column greater than a given value x. For example, with x = 3:
subset(dat,state %in% names(vv)[vv>3])
or without using subset (more efficient):
dat[dat$state %in% names(vv)[vv>3],]
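The same two-step logic (sum value per state, then keep the rows whose state total exceeds x) can be written out in plain Python as a cross-check. This is only a sketch of the technique, not the R code above; the names rows, totals and kept are chosen for the example.

```python
from collections import defaultdict

# The example data from the question as (state, value) pairs.
rows = [("a", 1), ("b", 2), ("c", 3), ("a", 1), ("b", 4), ("c", 5)]

# Step 1: sum of value per state (what tapply(..., sum) computes).
totals = defaultdict(int)
for state, value in rows:
    totals[state] += value

# Step 2: keep only rows whose state total is greater than x.
x = 3
kept = [(state, value) for state, value in rows if totals[state] > x]

print(dict(totals))  # {'a': 2, 'b': 6, 'c': 8}
print(kept)          # [('b', 2), ('c', 3), ('b', 4), ('c', 5)]
```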

Need the excel formula for following function

I need help to do the following function in a MS Excel sheet. The sheet example is as follows
     A          B     C       D             E
1    TimeStamp  Name  Amount  UsedBy        Description
-----------------------------------------------------------
2    Date1      Me1   200     He1,She1      desc1
3    Date2      Me1   100     Me1,He1       desc2
4    Date3      She1  50      He1,She1,Me1  desc3
5    Date4      He1   70      She1,He1      desc4
6    Date5      She1  200     She1,He1,Me1  desc5
7    Date6      Me1   22      He1           desc6
I want a function that performs the following sequence of steps in a single custom MS Excel formula:
1. Sum the cells of column "Amount" where the "UsedBy" cell contains "He1" as its only entity. Let's say the result is X.
2. Sum the cells of column "Amount" where the "UsedBy" cell contains two entities, one of which is "He1". Divide this sum by 2. Let's say the result is Y.
3. Sum the cells of column "Amount" where the "UsedBy" cell contains three entities, one of which is "He1". Divide this sum by 3. Let's say the result is Z.
4. Total the results of steps 1, 2 and 3, that is, X+Y+Z.
Please let me know if my question is not clear.
Try the SUMIF function.
Build some intermediate results like the number of values in UsedBy, or whether UsedBy contains He1 in separate columns, then use SUMIF().
You can't do this in a single formula unless you write it yourself in VBA. Since you haven't tagged the question as VBA I'll assume you'd rather use helper columns.
You'll need 3 helper columns, 1 for each of your criteria.
For your first, let's say you put it in column F:
=IF(AND(ISNUMBER(SEARCH(",He1,",","&D2&",")),LEN(D2)=LEN(SUBSTITUTE(D2,",",""))),1,0)
This checks that D2 contains 'He1' as a whole entry and that there are no commas. Both strings are wrapped in commas because SEARCH matches substrings and ignores case, so a plain search for "He1" would also match "She1".
For your second, put it in column G:
=IF(AND(ISNUMBER(SEARCH(",He1,",","&D2&",")),LEN(D2)-1=LEN(SUBSTITUTE(D2,",",""))),1,0)
This checks that D2 contains 'He1' as a whole entry and that there is exactly 1 comma.
For your third, put it in column H:
=IF(AND(ISNUMBER(SEARCH(",He1,",","&D2&",")),LEN(D2)-2=LEN(SUBSTITUTE(D2,",",""))),1,0)
This checks that D2 contains 'He1' as a whole entry and that there are exactly 2 commas.
Once you have your helper criteria columns, you can do a SUMIF for each criterion.
For X you'll do =sumif(f2:f7,1,c2:c7)
For Y you'll do =sumif(g2:g7,1,c2:c7)/2
For Z you'll do =sumif(h2:h7,1,c2:c7)/3
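As a cross-check of the arithmetic, the helper-column logic can be replayed outside Excel. The Python sketch below groups the Amount values by the number of entries in UsedBy for rows that contain "He1" as a whole entry, then applies the same divisions. It assumes the entries are separated by commas without spaces, as in the sample sheet, and the function name sums_by_group_size is made up for the example.

```python
from collections import defaultdict

# (Amount, UsedBy) pairs from rows 2-7 of the sample sheet.
rows = [
    (200, "He1,She1"),
    (100, "Me1,He1"),
    (50, "He1,She1,Me1"),
    (70, "She1,He1"),
    (200, "She1,He1,Me1"),
    (22, "He1"),
]


def sums_by_group_size(rows, person):
    """Sum Amount per number of UsedBy entries, for rows that list `person`."""
    sums = defaultdict(float)
    for amount, used_by in rows:
        users = used_by.split(",")
        if person in users:  # whole-entry match, so "She1" does not count as "He1"
            sums[len(users)] += amount
    return sums


sums = sums_by_group_size(rows, "He1")
x = sums[1]      # He1 alone
y = sums[2] / 2  # two entries including He1
z = sums[3] / 3  # three entries including He1
total = x + y + z

print(x, y, round(z, 2), round(total, 2))  # 22.0 185.0 83.33 290.33
```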
