An example of what I'm trying to do is given below. For each person, I want a query that looks at each reason and calculates a total of points. If reasons A-F are all present, the total is A+B+C+D+E-F; for John that is 10+20+30+40+50-60. If F isn't present, the total is a straight sum of the points (as for Paul).
ID name points reason
1 John 10 A
2 John 20 B
3 John 30 C
4 John 40 D
5 John 50 E
6 John 60 F
7 Paul 5 A
8 Paul 10 B
9 Paul 15 C
10 Paul 20 D
11 Paul 25 E

Try this:
Select [name],
Sum(IIF([reason] = "F", -[points], [points])) As TotalPoints
From YourTable
Group By [name]
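As a quick cross-check of the arithmetic outside the database, the same signed sum can be sketched in R. This is only an illustration: the data frame `pts` and the variables `signed` and `totals` are hypothetical names, and the rows are a hand-typed copy of the sample table above.

```r
# Hand-typed copy of the sample rows (not the asker's real table)
pts <- data.frame(
  name   = c(rep("John", 6), rep("Paul", 5)),
  points = c(10, 20, 30, 40, 50, 60, 5, 10, 15, 20, 25),
  reason = c("A", "B", "C", "D", "E", "F", "A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Flip the sign of the points for reason F, mirroring the IIF() expression
signed <- ifelse(pts$reason == "F", -pts$points, pts$points)

# Sum the signed points per person, mirroring GROUP BY [name]
totals <- tapply(signed, pts$name, sum)
print(totals)
```

With the sample rows, John's total comes out as 10+20+30+40+50-60 and Paul's as the plain sum of his five rows.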

If the points for reason F are already stored as a negative value (-60), there is no difference between 10+20+30+40+50-60 and what you call a straight sum.
In that case, all you need is to group by the name:
Select [name], Sum([points]) As TotalPoints
From YourTable
Group By [name]


Find the favorite and analyse sequence questions in R

We have a daily meeting when participants nominate each other to speak. The first person is chosen randomly.
I have a dataframe that consists of names and the order of speech every day.
The columns are day1, day2, day3, etc.
The data in the rows are numbers, meaning the order of speech on that particular day.
NA means that the person did not participate on that day.
Name day1 day2 day3 day4 ...
Albert 1 3 1 ...
Josh 2 2 NA
Veronica 3 5 3
Tim 4 1 2
Stew 5 4 4
I want to create two analyses. First, I want a dataframe showing whom each person has chosen the most times. (I know the result depends on whether a participant has already been nominated, since on that day they cannot be nominated again. I will handle that later; for now this is enough.)
It should look like this:
Name Favorite
Albert Stew
Josh Veronica
Veronica Tim
Tim Stew
My questions (feel free to answer only one if you can):
1. What code should I use for this without having to manually put the names in a different dataframe?
2. How should I handle a tie, for example if Josh chose Veronica and Tim the same number of times? Later I want to visualise it and I have no idea how to handle ties.
I would also like to analyse the results to visualise strong connections,
for example to show that there are people who usually choose each other.
Is there a good package that specialises in this? Or how should I approach it?
I do not need DNA sequences, only these simple ones, but I have not found a suitable package yet.
Thanks for your help!
If I am not misunderstanding your problem, here is some code to count how many times each person chose each other person as the next speaker. I added a fourth day so that some counts are greater than 1. There are ties in the result; keeping the first pair in each group by speaker ('who') may be a solution:
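library(dplyr)
df <- read.table(textConnection(
"Name day1 day2 day3 day4
Albert 1 3 1 3
Josh 2 2 NA 2
Veronica 3 5 3 1
Tim 4 1 2 4
Stew 5 4 4 5"), header = TRUE, stringsAsFactors = FALSE)
df
# For each day, list the speakers in speaking order and pair each with the next one
lapply(colnames(df)[-1], function (x) {
who <- df$Name[order(df[[x]],na.last=NA)]
data.frame(who, lead(who), stringsAsFactors = FALSE)
}) %>%
replyr::replyr_bind_rows() %>%
filter(!is.na(lead.who.)) %>% # the last speaker of a day has no successor
group_by(who,lead.who.) %>% summarise(n=n()) %>%
arrange(who, desc(n))
Name day1 day2 day3 day4
1 Albert 1 3 1 3
2 Josh 2 2 NA 2
3 Veronica 3 5 3 1
4 Tim 4 1 2 4
5 Stew 5 4 4 5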
# A tibble: 12 x 3
# Groups: who [5]
who lead.who. n
<chr> <chr> <int>
1 Albert Tim 2
2 Albert Josh 1
3 Albert Stew 1
4 Josh Albert 2
5 Josh Veronica 1
6 Stew Veronica 1
7 Tim Stew 2
8 Tim Josh 1
9 Tim Veronica 1
10 Veronica Josh 1
11 Veronica Stew 1
12 Veronica Tim 1
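One way to handle the ties mentioned above is to keep every tied favorite instead of picking one arbitrarily. The following is a minimal base-R sketch of that idea, not the answerer's code: the names `pairs`, `counts`, `top` and `favorites`, and the "/" separator, are assumptions made for illustration, and the data is the four-day example retyped by hand.

```r
# Speaking order per day (NA = absent); hand-typed copy of the example data
df <- data.frame(
  Name = c("Albert", "Josh", "Veronica", "Tim", "Stew"),
  day1 = c(1, 2, 3, 4, 5),
  day2 = c(3, 2, 5, 1, 4),
  day3 = c(1, NA, 3, 2, 4),
  day4 = c(3, 2, 1, 4, 5),
  stringsAsFactors = FALSE
)

# One row per hand-over (speaker, next speaker), collected across all days
pairs <- do.call(rbind, lapply(names(df)[-1], function(day) {
  who <- df$Name[order(df[[day]], na.last = NA)]
  data.frame(who = head(who, -1), nxt = tail(who, -1), stringsAsFactors = FALSE)
}))

# Count each (who, nxt) combination and drop combinations that never occur
counts <- as.data.frame(table(who = pairs$who, nxt = pairs$nxt),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Keep every successor that reaches the per-speaker maximum, so ties survive
top <- counts[counts$Freq == ave(counts$Freq, counts$who, FUN = max), ]

# Collapse tied favorites into one string per speaker
favorites <- aggregate(nxt ~ who, data = top, FUN = paste, collapse = "/")
print(favorites)
```

Keeping `top` in long form (one row per tied favorite) may be more convenient for later visualisation than the collapsed string, since each row can then be drawn as its own edge.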

How to convert a n x 3 data frame into a square (ordered) matrix?

I need to reshape a table (or data frame) to be able to use an R package (NetworkRiskMetrics). Suppose I have a data frame of lenders, borrowers and loan values:
lender borrower loan_USD
John Mark 100
Mark Paul 45
Joe Paul 30
Dan Mark 120
How do I convert this data frame into:
John Mark Joe Dan Paul
(placing zeros in empty cells)?
Use the reshape function:
d <- data.frame(lander=c('a','b','c', 'a'), borower=c('m','p','m','p'), loan=c(10,20,15,12))
d
lander borower loan
1 a m 10
2 b p 20
3 c m 15
4 a p 12
reshape(data=d, direction='long', varying=list('lander','borower'), idvar='loan', timevar='loan')
loan lander borower
10.1 1 a m
20.1 1 b p
15.1 1 c m
12.1 1 a p
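For the square matrix the question asks for, one possible base-R sketch is to build a zero matrix whose rows and columns share the same ordered set of names and then fill one cell per loan. The names `loans`, `people` and `m` are illustrative, the data is the example retyped by hand, and whether NetworkRiskMetrics expects lenders in rows or in columns is not checked here.

```r
# Hand-typed copy of the example loans
loans <- data.frame(
  lender   = c("John", "Mark", "Joe", "Dan"),
  borrower = c("Mark", "Paul", "Paul", "Mark"),
  loan_USD = c(100, 45, 30, 120),
  stringsAsFactors = FALSE
)

# Same ordered set of names on both dimensions, so the matrix is square
people <- unique(c(loans$lender, loans$borrower))

# Start from zeros: pairs without a loan stay 0
m <- matrix(0, nrow = length(people), ncol = length(people),
            dimnames = list(lender = people, borrower = people))

# Indexing with a two-column character matrix fills one (row, column) cell per loan
m[cbind(loans$lender, loans$borrower)] <- loans$loan_USD
print(m)
```

If the same lender-borrower pair can occur more than once, this assignment keeps only the last value; summing the loans per pair first (for example with `aggregate`) would avoid that.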

Classification according to unique values

I have a data frame named Records with 2 columns, Rank and Name:
Rank Name
1 Ashish
1 Ashish
2 Ashish
3 Mark
4 Mark
1 Mark
3 Spencer
2 Spencer
1 Spencer
2 Mary
4 Joseph
I want every name to be placed under tag 1, 2, 3 or 4 depending on its occurrence and uniqueness.
I want to create a new vector named Tagging.
So the output should be:
Rank 1 has three unique elements, Mark, Spencer and Ashish, so the tag is 1 for all three.
Rank 2 has one unique record, Mary; Ashish and Spencer have already been assigned tag 1, so only Mary is tagged as 2.
Rank 3 has no unique records, as Spencer and Mark have already been assigned 1, so I cannot tag 3 to anybody.
Rank 4 has one unique record, Joseph, so he gets tagged as 4.
Let me know which function can help me do this.
I do not want to use a loop, as this is a database with 1,000,000 rows.
The solution below follows the principle that a person's highest Rank (the smallest Rank number they appear with) is also that person's tag.
tbl <- read.table(header=TRUE, text='
Rank Name
1 Ashish
1 Ashish
2 Ashish
3 Mark
4 Mark
1 Mark
3 Spencer
2 Spencer
1 Spencer
2 Mary
4 Joseph
')
Ordering the 'tbl' dataframe by Rank
tbl_ord <- tbl[with(tbl,order(Rank)),]
Removing multiple occurrences of a name within the same Rank
> name_ord <- tbl_ord[!duplicated(tbl_ord),]
> name_ord
Rank Name
1 1 Ashish
6 1 Mark
9 1 Spencer
3 2 Ashish
8 2 Spencer
10 2 Mary
4 3 Mark
7 3 Spencer
5 4 Mark
11 4 Joseph
Displaying unique Names (the first, i.e. lowest-Rank, row for each Name)
> name_ord[!duplicated(name_ord$Name),]
Rank Name
1 1 Ashish
6 1 Mark
9 1 Spencer
10 2 Mary
11 4 Joseph
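If the goal is a Tagging column on the original rows rather than one row per name, the same principle (a person's tag is the smallest Rank they appear with) can be sketched with `ave()`, which is vectorised over groups and avoids an explicit loop. The data frame name `records` follows the question; the data is retyped by hand for illustration.

```r
# Hand-typed copy of the Records data from the question
records <- data.frame(
  Rank = c(1, 1, 2, 3, 4, 1, 3, 2, 1, 2, 4),
  Name = c(rep("Ashish", 3), rep("Mark", 3), rep("Spencer", 3), "Mary", "Joseph"),
  stringsAsFactors = FALSE
)

# For every row, take the minimum Rank within that row's Name group
records$Tagging <- ave(records$Rank, records$Name, FUN = min)
print(records)
```

This has not been timed on a million rows; for data of that size, a grouped minimum in data.table or dplyr may be a reasonable alternative to compare against.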
Using the setkey function of the data.table package and unique:
library(data.table)
dt<-data.table(Rank=c(1,1,2,3,4,1,3,2,1,2,4), Name=c(rep("Ashish", 3), rep("Mark", 3), rep("Spencer", 3), "Mary", "Joseph"))
setkey(dt, Name, Rank) # sort by Name, then Rank, so each person's lowest Rank comes first
dt<-unique(dt, by="Name") # keeps the first row per Name; since data.table 1.9.8 unique() no longer defaults to the key, so by= is needed
setkey(dt, Rank) # if you want to order them by Rank again

Efficiently joining two data tables with a condition

One data table (let's call it A) contains the ID numbers,
and another table (let's call it B) contains the lower bound, the upper bound and the name for each range of IDs.
ID_lower ID_upper Name
1 4 James
5 7 Arthur
8 11 Jacob
12 13 Sarah
So, based on table B, given an ID from table A, we can find the matching name by finding the row in table B such that
ID_lower <= ID <= ID_upper
and I want to create a table of ID and Name, so in the above example, it would be
ID Name
3 James
5 Arthur
12 Sarah
8 Jacob
... ...
I used a for loop: for each row of A, I look for the row in B such that ID is between ID_lower and ID_upper, and take the name from there.
However, this method was a bit slow. Is there a faster way of doing it in R?
Using non-equi joins in data.table (introduced in the v1.9.7 development version and released in v1.9.8), this is straightforward:
require(data.table) # v1.9.8+
dt1 = data.table(ID = c(3, 5, 12, 8)) # table A, using the IDs from the example above
dt2 = fread('ID_lower ID_upper Name
1 4 James
5 7 Arthur
8 11 Jacob
12 13 Sarah')
dt2[dt1, .(ID, Name), on=.(ID_lower <= ID, ID_upper >= ID)]
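You can make a look-up table with your second data.frame (B):
lu <- do.call(rbind, apply(B, 1, function(x)
data.frame(ID=c(x[1]:x[2]),Name=x[3], row.names = NULL)))
then you query it with your first data.frame (A):
A$Name <- lu$Name[match(A$ID, lu$ID)] # match on ID rather than row position, so the ranges need not start at 1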
You can try this data.table solution:
data.table::setDT(B)[, .(Name, ID = Map(`:`, ID_lower, ID_upper))
][, .(ID = unlist(ID)), .(Name)
][ID %in% A$ID]
Name ID
1: James 3
2: Arthur 5
3: Jacob 8
4: Sarah 12
I believe findInterval() on ID_lower might be the ideal approach here:
## ID Name
## 1: 3 James
## 2: 5 Arthur
## 3: 12 Sarah
## 4: 8 Jacob
This will only be correct if (1) B is sorted by ID_lower and (2) all values in A$ID are covered by the ranges in B.
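A minimal base-R sketch of the findInterval() idea, under the two conditions just stated, might look like the following. It is an illustration rather than the answerer's original code: the data frames `A` and `B` are retyped from the question, and the final guard that blanks out IDs falling outside their matched range is an added assumption about how gaps should be treated.

```r
# Hand-typed copies of tables A and B from the question
A <- data.frame(ID = c(3, 5, 12, 8))
B <- data.frame(
  ID_lower = c(1, 5, 8, 12),
  ID_upper = c(4, 7, 11, 13),
  Name     = c("James", "Arthur", "Jacob", "Sarah"),
  stringsAsFactors = FALSE
)

# Condition (1): findInterval() needs the break points in increasing order
B <- B[order(B$ID_lower), ]

# For each ID, the index of the last ID_lower that is <= ID (0 if there is none)
idx <- findInterval(A$ID, B$ID_lower)
idx[idx == 0] <- NA  # IDs below the first lower bound have no match

A$Name <- B$Name[idx]

# Condition (2) guard: an ID above its matched upper bound sits in a gap
A$Name[!is.na(idx) & A$ID > B$ID_upper[idx]] <- NA
print(A)
```

Because findInterval() does a binary search over the sorted lower bounds, the lookup is vectorised over all IDs instead of scanning B once per row of A.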

RPostgreSQL- Insert a column into another table according to the ID

I've used RPostgreSQL to connect R and PostgreSQL, and I'd like to add a column from one table to another according to the "pid". Please advise how this can be achieved using R commands:
>itemlist<- dbGetQuery(con, "SELECT * from project_budget_itemlist")
pid item cost
1 ABC 9
2 ACB 8
3 BAC 7
3 ZZZ 6
and another table is as follows:
>name<- dbGetQuery(con, "SELECT * from namelist")
pid name
1 Sally
2 Joy
3 Susan
I want the result to be:
pid item cost name
1 ABC 9 Sally
2 ACB 8 Joy
3 BAC 7 Susan
3 ZZZ 6 Susan
If there are no matching pids in the two outputs, the merge will return an empty data frame. If there are, then this should work:
merge(itemlist, name, by = "pid")
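To try the merge without a database connection, the two query results can be stood in for by hand-typed data frames. This is a sketch under that assumption: `itemlist` and `name` below only mimic what `dbGetQuery()` returned in the question, and `all.x = TRUE` is an optional addition that keeps items whose pid has no match in `name` (their name becomes NA) instead of dropping them.

```r
# Stand-ins for the two dbGetQuery() results shown in the question
itemlist <- data.frame(
  pid  = c(1, 2, 3, 3),
  item = c("ABC", "ACB", "BAC", "ZZZ"),
  cost = c(9, 8, 7, 6),
  stringsAsFactors = FALSE
)
name <- data.frame(
  pid  = c(1, 2, 3),
  name = c("Sally", "Joy", "Susan"),
  stringsAsFactors = FALSE
)

# Join on pid; all.x = TRUE keeps every row of itemlist even without a match
result <- merge(itemlist, name, by = "pid", all.x = TRUE)
print(result)
```

The same join could also be done on the database side with a SQL JOIN inside a single `dbGetQuery()` call, which avoids pulling both tables into R first.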
