I have three tables: Branch, Account_table and Customer. I am trying to write a SQL statement for the following:
At each branch, find the customers who have the highest balance in their savings account, displaying their names, the balance, the branch ID and the free overdraft limit in their current accounts.
I have created the three tables and inserted data:
Branch Table
BID BADDRESS.STREET BADDRESS.CITY BADDRESS.P
---------- -------------------- -------------------- ----------
901 Nicholson Street Edinburgh EH11 5AB
906 East End Garden Glasgow G181QP
912 Fredrick Street London LA112AS
918 Zink Terrace Edinburgh EH149UU
Account_table
ACCNUM ACCTYPE BALANCE BID.BID INRATE LIMITOFFREEOD OPENDATE
------- --------------- ---------- ---------- ---------- ------------- --------
1001 current 820.5 901 .005 800 01-MAY-11
1010 saving 2155 906 .02 0 08-MAR-10
1002 current 2600 912 .005 1000 10-APR-13
1011 saving 4140 918 .02 0 24-OCT-13
Customer Table
CUSTID CADDRESS.STREET CADDRESS.CITY CADDRESS.POSTCODE CNAME.FIRSTNAME CNAME.SURNAME
---------- -------------------- ----------- -------------------- --------------- -----------
1002 Adam Street Edinburgh EH112LQ Jack Smith
1003 Adam Street Edinburgh EH112LQ Anna Smith
1004 New Tweed Edinburgh EH1158L Liam Bain
1005 Dundas Street Edinburgh EH119MN Usman Afaque
1006 St Andres Square Edinburgh EH12LNM Claire Mackintosh
Branch(bID, street, city, p_code, bPhone)
Account(accNum, accType, balance, bID, inRate, limitOfFreeOD, openDate)
Customer(custID, street, city, postCode, title, firstName, surName, custHomePhone,custMobile1, custMobile2, niNum)
Bold is the primary key; italic is the foreign key. (In object-relational we don't use joins, if I am right.)
This is what I tried, but it failed:
select c.custid,
(select max(balance) from account_table a
where c.CUSTID = a.bid.bid
and a.acctype='saving' )as highest_saving,
c.cname.firstname,c.CNAME.surname
from customer c;
Any help? Thanks.
You are missing a custID column in the account table; without it there is no way to tell which customer owns which account. I have added it, plus a few more rows of data, to create a test case for your requirement.
drop table acct;
drop table branch;
drop table customer;
create table branch(bid number primary key, addr_street varchar2(100), addr_city varchar2(100), addr_p varchar2(20));
insert into branch values(901,'Nicholson Street','Edinburgh','EH11 5AB');
insert into branch values(906,'East End Garden','Glasgow','G181QP');
insert into branch values(912,'Fredrick Street','London','LA112AS');
insert into branch values(918,'Zink Terrace','Edinburgh','EH149UU');
commit;
select * from branch;
Output:
BID ADDR_STREET ADDR_CITY ADDR_P
901 Nicholson Street Edinburgh EH11 5AB
906 East End Garden Glasgow G181QP
912 Fredrick Street London LA112AS
918 Zink Terrace Edinburgh EH149UU
create table customer(custid number primary key, caddr_street varchar2(100), caddr_city varchar2(100),
caddr_p varchar2(10), fname varchar2(100), lname varchar2(100));
insert into customer values(1002,'Adam Street','Edinburgh','EH112LQ','Jack','Smith');
insert into customer values(1003,'Adam Street','Edinburgh','EH112LQ','Anna','Smith');
insert into customer values(1004,'New Tweed','Edinburgh','EH1158L','Liam','Bain');
insert into customer values(1005,'Dundas Street','Edinburgh','EH119MN','Usman','Afaque');
insert into customer values(1006,'St Andres Square','Edinburgh','EH12LNM','Claire','Mackintosh');
commit;
select * from customer;
Output:
CUSTID CADDR_STREET CADDR_CITY CADDR_P FNAME LNAME
1002 Adam Street Edinburgh EH112LQ Jack Smith
1003 Adam Street Edinburgh EH112LQ Anna Smith
1004 New Tweed Edinburgh EH1158L Liam Bain
1005 Dundas Street Edinburgh EH119MN Usman Afaque
1006 St Andres Square Edinburgh EH12LNM Claire Mackintosh
create table acct(accnum number primary key, acctype varchar2(20), balance number, bid number
constraint acct_fk1 references branch(bid),
inrate number, LIMITOFFREEOD number, OPENDATE date, custid number
constraint acct_fk2 references customer(custid));
insert into acct values(1001,'current',820.5,901,0.005,800,to_date('01-MAY-11','dd-mon-yy'),1002);
insert into acct values(1010,'saving',2155,906,0.02,0,to_date('08-MAR-10','dd-mon-yy'),1002);
insert into acct values(1002,'current',2600,912,0.005,1000,to_date('10-APR-13','dd-mon-yy'),1006);
insert into acct values(1011,'saving',4140,918,0.02,0,to_date('24-OCT-13','dd-mon-yy'),1004);
insert into acct values(1012,'saving',4155,906,0.02,0,to_date('08-MAR-10','dd-mon-yy'),1004);
insert into acct values(1013,'current',2600,918,0.005,1000,to_date('10-APR-13','dd-mon-yy'),1004);
commit;
select * from acct;
Output:
ACCNUM ACCTYPE BALANCE BID INRATE LIMITOFFREEOD OPENDATE CUSTID
1001 current 820.5 901 .005 800 01-MAY-11 1002
1010 saving 2155 906 .02 0 08-MAR-10 1002
1002 current 2600 912 .005 1000 10-APR-13 1006
1011 saving 4140 918 .02 0 24-OCT-13 1004
1012 saving 4155 906 .02 0 08-MAR-10 1004
1013 current 2600 918 .005 1000 10-APR-13 1004
select y.fname, y.lname, y.balance, y.bid,ac.accnum,ac.acctype,ac.LIMITOFFREEOD
from (select *
from (select b.bid, c.custid, a.accnum,a.balance,
row_number() over(partition by b.bid order by a.balance desc) rn,
c.fname, c.lname
from acct a
inner join
branch b
on a.bid = b.bid
inner join
customer c
on a.custid = c.custid
where a.acctype = 'saving') x
where x.rn = 1) y
left join
acct ac
on y.custid = ac.custid
and y.bid = ac.bid
and ac.acctype = 'current';
Output:
FNAME LNAME BALANCE BID ACCNUM ACCTYPE LIMITOFFREEOD
Liam Bain 4140 918 1013 current 1000
Liam Bain 4155 906 NULL NULL NULL
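Note that row_number() keeps exactly one row per branch, so if two customers tie for the highest savings balance, one of them is dropped. Below is a minimal sketch of a tie-preserving variant that compares each savings balance with the branch maximum instead. It runs through Python's sqlite3 module rather than Oracle, and it uses a trimmed-down copy of the test tables (only the columns the query needs), both of which are assumptions made to keep the example short and self-contained.

```python
import sqlite3

# In-memory database with a reduced version of the test data above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customer(custid integer primary key, fname text, lname text);
create table acct(accnum integer primary key, acctype text, balance real,
                  bid integer, limitoffreeod real, custid integer);
insert into customer values (1002, 'Jack', 'Smith');
insert into customer values (1004, 'Liam', 'Bain');
insert into customer values (1006, 'Claire', 'Mackintosh');
insert into acct values (1001, 'current', 820.5, 901, 800, 1002);
insert into acct values (1010, 'saving', 2155, 906, 0, 1002);
insert into acct values (1002, 'current', 2600, 912, 1000, 1006);
insert into acct values (1011, 'saving', 4140, 918, 0, 1004);
insert into acct values (1012, 'saving', 4155, 906, 0, 1004);
insert into acct values (1013, 'current', 2600, 918, 1000, 1004);
""")

# Keep every savings account whose balance equals the maximum savings
# balance at its branch, then look up a current account held by the
# same customer at the same branch (if there is one).
query = """
select c.fname, c.lname, s.balance, s.bid, cur.limitoffreeod
from acct s
join customer c on c.custid = s.custid
left join acct cur
       on cur.custid = s.custid
      and cur.bid = s.bid
      and cur.acctype = 'current'
where s.acctype = 'saving'
  and s.balance = (select max(s2.balance)
                   from acct s2
                   where s2.acctype = 'saving'
                     and s2.bid = s.bid)
order by s.bid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In Oracle the same effect can be had by swapping row_number() for rank() in the original query, since rank() gives tied rows the same number.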
I am trying to run a tutorial for prophet, which uses R magic in a Jupyter notebook. The following code:
%%R
library(prophet)
df <- read.csv('../examples/example_wp_peyton_manning.csv')
df$y <- log(df$y)
m <- prophet(df)
future <- make_future_dataframe(m, periods=366)
Returns this:
Error in library(prophet) : there is no package called ‘prophet’
Then, in my iPython notebook I run this:
from rpy2.robjects.packages import importr
utils = importr('utils')
utils.install_packages('prophet')
Which returns this:
--- Please select a CRAN mirror for use in this session ---
Secure CRAN mirrors
1: 0-Cloud [https] 2: Australia (Canberra) [https]
3: Australia (Melbourne) [https] 4: Australia (Perth) [https]
5: Austria [https] 6: Belgium (Ghent) [https]
7: Brazil (RJ) [https] 8: Brazil (SP 1) [https]
9: Bulgaria [https] 10: Chile 1 [https]
11: China (Lanzhou) [https] 12: Colombia (Cali) [https]
13: Czech Republic [https] 14: Denmark [https]
15: France (Lyon 1) [https] 16: France (Lyon 2) [https]
17: France (Marseille) [https] 18: France (Montpellier) [https]
19: France (Paris 2) [https] 20: Germany (Münster) [https]
21: Iceland [https] 22: Indonesia (Jakarta) [https]
23: Ireland [https] 24: Italy (Padua) [https]
25: Japan (Tokyo) [https] 26: Malaysia [https]
27: Mexico (Mexico City) [https] 28: Norway [https]
29: Philippines [https] 30: Russia (Moscow) [https]
31: Spain (A Coruña) [https] 32: Spain (Madrid) [https]
33: Sweden [https] 34: Switzerland [https]
35: UK (Bristol) [https] 36: UK (Cambridge) [https]
37: UK (London 1) [https] 38: USA (CA 1) [https]
39: USA (KS) [https] 40: USA (MI 1) [https]
41: USA (TN) [https] 42: USA (TX 1) [https]
43: USA (TX 2) [https] 44: (other mirrors)
An input box shows up, and any selection I make leads to this:
rpy2.rinterface.NULL
I have RStudio, and prophet is up and running without problems there. This tells me that I have another R kernel running somewhere, linked to the Anaconda environment, or some other configuration error.
Is there any way to fix this issue so that I can run R with the kernel I have in RStudio, or force the current R kernel to install prophet?
How do I know the location of the R kernel used by R magic in this Jupyter notebook?
I am using a Mac, and I might have some cross-linked files, etc. (My Jupyter notebook shows 6 kernels when I really have 3; each one is listed twice.)
Thanks
You probably have two versions of R. When you install the R kernel from Anaconda, it installs its own version of R, regardless of what you have in RStudio. This is what you should do. From a Jupyter notebook, run the following in a cell:
%load_ext rpy2.ipython
Then
%%R
.libPaths()
It should return something like this:
[1] "/Users/user/anaconda/lib/R/library"
Now go to RStudio and run the same line:
.libPaths()
It probably returns something like this:
[1] "/Users/user/Library/R/3.2/library"
[2] "/Library/Frameworks/R.framework/Versions/3.2/Resources/library"
In this example, you can see that one R is in Anaconda and the other is a standalone R. The one in your RStudio, where you correctly loaded prophet, is the standalone one.
The best solution is to have RStudio use the same version that Conda is using. There are many ways to switch between the two versions, but the best is a simple utility called RSwitch that you can download from here.
RSwitch detects all the versions of R that you have on your computer and lets RStudio switch among them.
Again, my suggestion is to switch to the version of R that Conda is using and install your packages from RStudio, rather than from a Jupyter notebook, which can show errors such as the
rpy2.rinterface.NULL
that you indicated. Hope this works.
Many questions in the question. Answering one of them:
How do I know the location of the R kernel used by R magic in this Jupyter notebook?
In Jupyter, do:
%run -m rpy2.situation
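If you want the same information without rpy2's helper, here is a rough stand-alone sketch. It assumes, as far as I understand rpy2's behaviour, that R is located through the R_HOME environment variable first and otherwise by asking the R found on the PATH for its home with `R RHOME`. The function name find_r_home is made up for this example.

```python
import os
import shutil
import subprocess


def find_r_home():
    """Return the R home directory a PATH-based lookup finds, or None."""
    # An explicit R_HOME wins if it is set.
    r_home = os.environ.get("R_HOME")
    if r_home:
        return r_home
    # Otherwise ask whichever R executable comes first on the PATH.
    r_exe = shutil.which("R")
    if r_exe is None:
        return None
    try:
        result = subprocess.run(
            [r_exe, "RHOME"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


print(find_r_home())
```

Run it in a notebook cell: a path under your Anaconda directory means the notebook is using Conda's R, not the one RStudio uses.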
I extract data from elasticsearch as follows:
> packageVersion("elastic")
[1] '0.7.8'
# data extract
body <- list(query=list(range=list(timestamp=list(gte="2016-10-13", lte="2016-10-15"))))
b3 <- Search(index="myIndex",
sort=c("timestamp:desc"),
fields=c('timestamp','A','B','C','D','E','F','G'),
body=body,
size=3)
The first and second elements are extracted OK (edited to save space):
$hits$hits[[1]]$fields$F,E,B,G,C,A,D,timestamp
$hits$hits[[2]]$fields$F,E,B,G,C,A,D,timestamp
The third element is not fully extracted:
$hits$hits[[3]]$fields$C,A,B,D,timestamp
==
I convert the list to the data frame as per this post:
Convert in R output of package Elastic (nested list?) to data.frame or JSON
The first and the second elements are loaded perfectly.
The third element is loaded incorrectly because it was not fully extracted, which causes the following errors:
# (optional) verify that all hits expand to the same length
# (should be true for data intended to be in a table format)
stopifnot(
sapply(
b3$hits$hits,
function(x) {!(length(unlist(x)) - length(unlist(b3$hits$hits[[1]])))}
)
)
Error: sapply(b3$hits$hits, function(x) { .... are not all TRUE
# load into the dataframe
# count number of columns, use unlist() to convert
# nested lists to a vector, use the first hit as proxy
nColumns <- length(unlist(b3$hits$hits[[1]]))
# fetch column names ... as above
nNames <- names(unlist(b3$hits$hits[[1]]))
# unlist all hits and convert to matrix with ncol Columns, don't forget byrow=TRUE!
df.b3 <- data.frame(matrix(unlist(b3$hits$hits), ncol=nColumns, byrow=TRUE))
Warning message:
In matrix(unlist(b3$hits$hits), ncol = nColumns, byrow = TRUE) :
data length [33] is not a sub-multiple or multiple of the number of columns [12]
>
Note: some records in variables D, E, F and G contain empty (NULL) and '-' values. I suspect this may cause the issue with the extract.
I'd love some feedback if any of you has encountered a similar issue and found a solution.
Thanks a lot.
Author of elastic here.
We don't attempt to coerce output into data.frames, since the output can be so variable that we'd likely run into errors often. But we do let you pass an option on to jsonlite to coerce to a data.frame (via the asdf parameter, for "as data.frame"), as that shouldn't ever fail.
If you are getting back list output, I would use either dplyr or data.table.
For reproducibility:
library(elastic)
if (!index_exists("shakespeare")) {
shakespeare <- system.file("examples", "shakespeare_data.json", package = "elastic")
docs_bulk(shakespeare)
}
res <- Search(index="shakespeare", fields=c('play_name','speaker'))
out <- lapply(res$hits$hits, function(x) unlist(x$fields, FALSE))
dplyr
library(dplyr)
bind_rows(out)
#> # A tibble: 10 × 2
#> play_name speaker
#> <chr> <chr>
#> 1 Henry IV
#> 2 Henry IV KING HENRY IV
#> 3 Henry IV KING HENRY IV
#> 4 Henry IV KING HENRY IV
#> 5 Henry IV KING HENRY IV
#> 6 Henry IV KING HENRY IV
#> 7 Henry IV KING HENRY IV
#> 8 Henry IV KING HENRY IV
#> 9 Henry IV WESTMORELAND
#> 10 Henry IV WESTMORELAND
data.table
library(data.table)
rbindlist(out, fill = TRUE, use.names = TRUE)
#> play_name speaker
#> 1: Henry IV
#> 2: Henry IV KING HENRY IV
#> 3: Henry IV KING HENRY IV
#> 4: Henry IV KING HENRY IV
#> 5: Henry IV KING HENRY IV
#> 6: Henry IV KING HENRY IV
#> 7: Henry IV KING HENRY IV
#> 8: Henry IV KING HENRY IV
#> 9: Henry IV WESTMORELAND
#> 10: Henry IV WESTMORELAND
Or use the asdf parameter, which internally directs jsonlite::fromJSON to parse to a data.frame if possible.
res <- Search(index="shakespeare", fields=c('play_name','speaker'), asdf = TRUE)
res$hits$hits$fields
#> play_name speaker
#> 1 Henry IV
#> 2 Henry IV KING HENRY IV
#> 3 Henry IV KING HENRY IV
#> 4 Henry IV KING HENRY IV
#> 5 Henry IV KING HENRY IV
#> 6 Henry IV KING HENRY IV
#> 7 Henry IV KING HENRY IV
#> 8 Henry IV KING HENRY IV
#> 9 Henry IV WESTMORELAND
#> 10 Henry IV WESTMORELAND
Using:
R v3.3.2
OSX
elastic v0.7.8.9000
Elasticsearch v2.3.4
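For readers pulling the hits through a Python client instead, the fill-missing-fields idea behind bind_rows() and rbindlist(fill = TRUE) can be sketched in plain Python. The hits below are made-up data shaped like the response in the question (each field value wrapped in a list, and one hit missing a field), and hits_to_rows is a hypothetical helper, not part of any Elasticsearch client.

```python
def hits_to_rows(hits):
    """Turn ragged hits into rows that all share the same columns."""
    # Collect every field name seen in any hit, in first-seen order.
    columns = []
    for hit in hits:
        for name in hit.get("fields", {}):
            if name not in columns:
                columns.append(name)

    rows = []
    for hit in hits:
        fields = hit.get("fields", {})
        row = {}
        for name in columns:
            value = fields.get(name)  # None when the hit lacks the field
            # Unwrap single-element lists such as ["x"] to "x".
            if isinstance(value, list) and len(value) == 1:
                value = value[0]
            row[name] = value
        rows.append(row)
    return columns, rows


# Illustrative hits only: the second one has no "E" field.
hits = [
    {"fields": {"timestamp": ["2016-10-15"], "A": [1], "E": ["x"]}},
    {"fields": {"timestamp": ["2016-10-14"], "A": [2]}},
]
columns, rows = hits_to_rows(hits)
print(columns)
for row in rows:
    print(row)
```

Because every row ends up with the same keys, the result can be handed to a table constructor without the length mismatch that matrix() warned about.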
I have the following dataframe, main_df.
structure(list(Id = c(190150L, 243744L, 204796L, 139630L, 156541L,
157377L, 225627L), Name = c("Columbia University in the City of New York",
"Stanford University", "Ohio State University-Main Campus", "Emmanuel College",
"University of the Cumberlands", "Midway University", "University of the Incarnate Word"
), desired_sport = c("Archery", "Synchronized Swimming", "Synchronized Swimming",
"Archery", "Archery", "Archery", "Synchronized Swimming"), academic_strength = c("elite",
"elite", "average", "weak", "weak", "weak", "weak")), .Names = c("Id",
"Name", "desired_sport", "academic_strength"), class = "data.frame", row.names = c(1L,
258L, 1043L, 1144L, 1145L, 1146L, 1500L))
I need to have different desired sports and different levels of academic strengths.
At minimum, I need a dataframe that has at least 3 rows in each of the combinations of desired_sport and academic_strength.
In order to find that, I created a separate df to see how many were present in each combination
aggregate_test_df <- aggregate(Id ~ desired_sport + academic_strength, main_df, length)
I then created a new df with the maximum combinations that I needed and thought I could "cbind" the extra columns and then fill in the remainder.
The new combination_test_df was created as follows:
library(splitstackshape)  # provides expandRows()
academic_strength <- c("elite", "strong", "average", "weak")
sport_test <- c("Archery", "Synchronized Swimming")
combination_test_df <- expand.grid(sport_test, academic_strength)
i <- sapply(combination_test_df, is.factor)
combination_test_df[i] <- lapply(combination_test_df[i], as.character)
combination_test_df$count <- 3
combination_test_df2 <- expandRows(combination_test_df, "count")
I then got stuck: I could not merge or cbind without creating more combinations.
The desired output would be a dataframe with each "desired_sport", "academic_strength" combination repeated 3 times; some rows will be NA and some will be filled in, which will let me create rules to fill in the NAs in the "Name" and "Id" columns.
Output would look like a dataframe similar to this:
Id Name desired_sport academic_strength
190150 Columbia University in the City of New York Archery elite
NA NA Archery elite
NA NA Archery elite
NA NA Archery strong
NA NA Archery strong
NA NA Archery strong
NA NA Archery average
NA NA Archery average
NA NA Archery average
139630 Emmanuel College Archery weak
156541 University of the Cumberlands Archery weak
157377 Midway University Archery weak
243744 Stanford University Synchronized Swimming elite
NA NA Synchronized Swimming elite
NA NA Synchronized Swimming elite
NA NA Synchronized Swimming strong
NA NA Synchronized Swimming strong
NA NA Synchronized Swimming strong
204796 Ohio State University - Main Campus Synchronized Swimming average
NA NA Synchronized Swimming average
NA NA Synchronized Swimming average
NA NA Synchronized Swimming weak
NA NA Synchronized Swimming weak
NA NA Synchronized Swimming weak
I would then like to be able to fill in the NAs, so the complete final dataframe would be:
Id Name desired_sport academic_strength
190150 Columbia University in the City of New York Archery elite
139630 Emmanuel College Archery elite
156541 University of the Cumberlands Archery elite
139630 Emmanuel College Archery strong
156541 University of the Cumberlands Archery strong
157377 Midway University Archery strong
139630 Emmanuel College Archery average
156541 University of the Cumberlands Archery average
157377 Midway University Archery average
139630 Emmanuel College Archery weak
156541 University of the Cumberlands Archery weak
157377 Midway University Archery weak
243744 Stanford University Synchronized Swimming elite
204796 Ohio State University - Main Campus Synchronized Swimming elite
NA NA Synchronized Swimming elite
204796 Ohio State University - Main Campus Synchronized Swimming strong
NA NA Synchronized Swimming strong
NA NA Synchronized Swimming strong
204796 Ohio State University - Main Campus Synchronized Swimming average
NA NA Synchronized Swimming average
NA NA Synchronized Swimming average
NA NA Synchronized Swimming weak
NA NA Synchronized Swimming weak
NA NA Synchronized Swimming weak
Any advice?
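One way to get the padded frame (the first desired output) is to group the existing rows by combination and append NA rows until each group reaches the minimum. Below is a minimal sketch of that step in plain Python rather than R, using tuples in place of data frame rows; pad_groups and MIN_ROWS are names made up for the example, and the fill-in rules for the second output are not attempted.

```python
from itertools import product

# Rows of main_df as (Id, Name, desired_sport, academic_strength).
main_rows = [
    (190150, "Columbia University in the City of New York", "Archery", "elite"),
    (243744, "Stanford University", "Synchronized Swimming", "elite"),
    (204796, "Ohio State University-Main Campus", "Synchronized Swimming", "average"),
    (139630, "Emmanuel College", "Archery", "weak"),
    (156541, "University of the Cumberlands", "Archery", "weak"),
    (157377, "Midway University", "Archery", "weak"),
    (225627, "University of the Incarnate Word", "Synchronized Swimming", "weak"),
]
sports = ["Archery", "Synchronized Swimming"]
strengths = ["elite", "strong", "average", "weak"]
MIN_ROWS = 3


def pad_groups(rows, sports, strengths, min_rows):
    """Give every (sport, strength) combination at least min_rows rows."""
    # Bucket the existing rows by their combination.
    groups = {}
    for row in rows:
        groups.setdefault((row[2], row[3]), []).append(row)

    padded = []
    for sport, strength in product(sports, strengths):
        present = groups.get((sport, strength), [])
        padded.extend(present)
        # Top up with placeholder rows whose Id and Name are None (NA).
        missing = max(0, min_rows - len(present))
        padded.extend([(None, None, sport, strength)] * missing)
    return padded


padded = pad_groups(main_rows, sports, strengths, MIN_ROWS)
for row in padded:
    print(row)
```

Note that main_df contains a weak Synchronized Swimming row (University of the Incarnate Word), so that combination gets two placeholder rows here rather than the three shown in the hand-written output.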
I installed R-3.2.2 from source (./configure, make, make install). It works perfectly fine, but when I try to install any package from any repository, I get the following message:
> install.packages("igraph")
Installing package into ‘/home/jonathan/R/x86_64-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
Error in download.file(url, destfile = f, quiet = TRUE) :
unsupported URL scheme
HTTPS CRAN mirror
1: 0-Cloud [https] 2: Austria [https]
3: China (Beijing 4) [https] 4: China (Hefei) [https]
5: Colombia (Cali) [https] 6: France (Lyon 2) [https]
7: Iceland [https] 8: Russia (Moscow 1) [https]
9: Switzerland [https] 10: UK (Bristol) [https]
11: UK (Cambridge) [https] 12: USA (CA 1) [https]
13: USA (KS) [https] 14: USA (MI 1) [https]
15: USA (TN) [https] 16: USA (TX) [https]
17: USA (WA) [https] 18: (HTTP mirrors)
Selection: 10
Warning: unable to access index for repository https://www.stats.bris.ac.uk/R/src/contrib
Warning message:
package ‘igraph’ is not available (for R version 3.2.2)
I'm not using any proxy, and I tried doing what is said here: I've installed build-essential and r-base-dev with apt-get, but the error persists.
What is strange though is that with RStudio on the same machine, the download of packages works fine, the problem occurs only when I use R from the command line.
The mirror you chose is an HTTPS mirror. You need to have a secure connection set up in order to use HTTPS mirrors.
Select 18 (HTTP mirrors) and you will see a list of additional mirrors. Pick one of those.
Alternatively, you can use chooseCRANmirror():
> chooseCRANmirror()
HTTPS CRAN mirror
1: 0-Cloud [https] 2: Austria [https]
3: Chile [https] 4: China (Beijing 4) [https]
5: Colombia (Cali) [https] 6: France (Lyon 2) [https]
7: Germany (Münster) [https] 8: Iceland [https]
9: Russia (Moscow) [https] 10: Spain (A Coruña) [https]
11: Switzerland [https] 12: UK (Bristol) [https]
13: UK (Cambridge) [https] 14: USA (CA 1) [https]
15: USA (KS) [https] 16: USA (MI 1) [https]
17: USA (TN) [https] 18: USA (TX) [https]
19: USA (WA) [https] 20: (HTTP mirrors)
Selection: 20
HTTP CRAN mirror
1: 0-Cloud 2: Algeria
3: Argentina (La Plata) 4: Australia (Canberra)
5: Australia (Melbourne) 6: Austria
7: Belgium (Antwerp) 8: Belgium (Ghent)
-------------------------------------------------------------
87: USA (MI 1) 88: USA (MI 2)
89: USA (MO) 90: USA (NC)
91: USA (OH 1) 92: USA (OH 2)
93: USA (OR) 94: USA (PA 1)
95: USA (PA 2) 96: USA (TN)
97: USA (TX) 98: USA (WA)
99: Venezuela 100: Vietnam
Selection: 56
>
I realise this is almost 2 years later, but I couldn't find an answer, so I have added my solution here.
I came up against the same problem: I couldn't download from https but could from http, on macOS Sierra 10.12.6 with R version 3.4.1 and libcurl 7.55.1 with https support. The problem for me was that I had no certificates for https. I downloaded the file from https://raw.githubusercontent.com/bagder/ca-bundle/master/ca-bundle.crt, set the environment variable CURL_CA_BUNDLE to the full path to ca-bundle.crt, and this worked.
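If you launch R from a script, the same setting can be applied per process instead of globally. This is a small sketch in Python under that assumption; env_with_ca_bundle is a made-up helper name, and the launch command in the trailing comment is only an illustration.

```python
import os


def env_with_ca_bundle(bundle_path, base_env=None):
    """Return a copy of the environment with CURL_CA_BUNDLE set."""
    # Use the full path, as in the fix described above.
    bundle_path = os.path.abspath(os.path.expanduser(bundle_path))
    if not os.path.isfile(bundle_path):
        raise FileNotFoundError(bundle_path)
    env = dict(os.environ if base_env is None else base_env)
    env["CURL_CA_BUNDLE"] = bundle_path
    return env


# Example use (not run here; the bundle location is an assumption):
#   import subprocess
#   env = env_with_ca_bundle("~/ca-bundle.crt")
#   subprocess.run(["R", "--vanilla"], env=env)
```

Checking that the file exists up front avoids a silent typo in the path, which would leave R with the same certificate problem as before.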