OTA_AirRulesLLS returns DuplicateFareInfo instead of FareRuleInfo - sabre

Occasionally, the OTA_AirRulesLLS service returns a DuplicateFareInfo element instead of a FareRuleInfo element. What causes this, and how do we fetch the actual fare rules when this happens?
Request
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:mes="http://www.ebxml.org/namespaces/messageHeader" xmlns:ns="http://webservices.sabre.com/sabreXML/2011/10" xmlns:sec="http://schemas.xmlsoap.org/ws/2002/12/secext">
<soapenv:Header>
<sec:Security>
<sec:BinarySecurityToken>Shared/IDL:IceSess\/SessMgr:1\.0.IDL/Common/!ICESMS\/RESG!ICESMSLB\/RES.LB!1667377763911!3333!305</sec:BinarySecurityToken>
</sec:Security>
<mes:MessageHeader>
<mes:ConversationId>166737776258431097</mes:ConversationId>
<mes:To>
<mes:PartyId>WS_Server</mes:PartyId>
</mes:To>
<mes:From>
<mes:PartyId>WS_Client</mes:PartyId>
</mes:From>
<mes:Service>OTA_AirRulesLLSRQ</mes:Service>
<mes:Action>OTA_AirRulesLLSRQ</mes:Action>
<mes:MessageData>
<mes:MessageId>166737776408193621</mes:MessageId>
<mes:Timestamp>2022-11-02T08:29:24+0000</mes:Timestamp>
</mes:MessageData>
<mes:CPAId>56QD</mes:CPAId>
</mes:MessageHeader>
</soapenv:Header>
<soapenv:Body>
<ns:OTA_AirRulesRQ Version="2.3.0">
<ns:OriginDestinationInformation>
<ns:FlightSegment>
<ns:DestinationLocation LocationCode="ZRH" />
<ns:MarketingCarrier Code="QR" />
<ns:OriginLocation LocationCode="SIN" />
</ns:FlightSegment>
</ns:OriginDestinationInformation>
<ns:RuleReqInfo>
<ns:Category>0</ns:Category>
<ns:Category>1</ns:Category>
<ns:Category>2</ns:Category>
<ns:Category>3</ns:Category>
<ns:Category>4</ns:Category>
<ns:Category>5</ns:Category>
<ns:Category>6</ns:Category>
<ns:Category>7</ns:Category>
<ns:Category>8</ns:Category>
<ns:Category>9</ns:Category>
<ns:Category>10</ns:Category>
<ns:Category>11</ns:Category>
<ns:Category>12</ns:Category>
<ns:Category>13</ns:Category>
<ns:Category>14</ns:Category>
<ns:Category>15</ns:Category>
<ns:Category>16</ns:Category>
<ns:Category>17</ns:Category>
<ns:Category>18</ns:Category>
<ns:Category>19</ns:Category>
<ns:Category>20</ns:Category>
<ns:Category>21</ns:Category>
<ns:Category>22</ns:Category>
<ns:Category>23</ns:Category>
<ns:Category>25</ns:Category>
<ns:Category>26</ns:Category>
<ns:Category>27</ns:Category>
<ns:Category>28</ns:Category>
<ns:Category>29</ns:Category>
<ns:Category>31</ns:Category>
<ns:Category>35</ns:Category>
<ns:Category>50</ns:Category>
<ns:FareBasis Code="MJR9R1SW" />
</ns:RuleReqInfo>
</ns:OTA_AirRulesRQ>
</soapenv:Body>
</soapenv:Envelope>
Response
<?xml version="1.0" encoding="UTF-8"?>
<soap-env:Envelope xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/"><soap-env:Header><eb:MessageHeader xmlns:eb="http://www.ebxml.org/namespaces/messageHeader" eb:version="1.0" soap-env:mustUnderstand="1"><eb:From><eb:PartyId eb:type="URI">WS_Server</eb:PartyId></eb:From><eb:To><eb:PartyId eb:type="URI">WS_Client</eb:PartyId></eb:To><eb:CPAId>56QD</eb:CPAId><eb:ConversationId>166737776258431097</eb:ConversationId><eb:Service>OTA_AirRulesLLSRQ</eb:Service><eb:Action>OTA_AirRulesLLSRS</eb:Action><eb:MessageData><eb:MessageId>1257113305651560914</eb:MessageId><eb:Timestamp>2022-11-02T08:29:25</eb:Timestamp><eb:RefToMessageId>166737776408193621</eb:RefToMessageId></eb:MessageData></eb:MessageHeader><wsse:Security xmlns:wsse="http://schemas.xmlsoap.org/ws/2002/12/secext"><wsse:BinarySecurityToken valueType="String" EncodingType="wsse:Base64Binary">Shared/IDL:IceSess\/SessMgr:1\.0.IDL/Common/!ICESMS\/RESG!ICESMSLB\/RES.LB!1667377763911!3333!305</wsse:BinarySecurityToken></wsse:Security></soap-env:Header><soap-env:Body><OTA_AirRulesRS xmlns="http://webservices.sabre.com/sabreXML/2011/10" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stl="http://services.sabre.com/STL/v01" Version="2.3.0">
<stl:ApplicationResults status="Complete">
<stl:Success timeStamp="2022-11-02T03:29:25-05:00"/>
</stl:ApplicationResults>
<DuplicateFareInfo>
<Text>SIN-ZRH CXR-QR WED 02NOV22 SGD
EY 0/ 0/ 1 TG 0/ 0/ 2 EK 0/ 0/ 2 SQ 0/ 0/ 2 BA 0/ 0/ 5
LH 0/ 0/14 LX 0/ 0/ 1 QR 0/ 0/ 2 AY 0/ 0/ 2 AF 0/ 0/ 3
QF 0/ 0/ 2 TK 0/ 0/ 3 NH 0/ 0/ 3
//SEE FQHELP FOR INFORMATION ABOUT THE NEW FARE DISPLAYS//
SURCHARGE FOR PAPER TICKET MAY BE ADDED WHEN ITIN PRICED
QR-QRY/ECONVENIEN - ECONOMY CONVENIENCE
QR SINZRH.EH 02NOV22 MPM 8011
V FARE BASIS AP FARE-OW FARE-RT BK SEASON MINMAX RTG
1 MJR9R1SW - 1296.00 M --- -/12M EH01
2 MJR9R1SW - 1439.00 M --- -/12M EH02
3 MJR9R1SW - 1465.00 M --- -/12M EH03
4 MJR9R1SW - 1497.00 M --- -/12M EH04
5 MJR9R1SW - 1582.00 M --- -/12M EH05
6 MJR9R1SW - 1606.00 M --- -/12M EH06
7 MJR9R1SW - 1606.00 M --- -/12M EH07
8 MJR9R1SW - 1697.00 M --- -/12M EH08
9 MJR9R1SW - 1697.00 M --- -/12M EH09
10 MJR9R1SW - 1718.00 M --- -/12M EH10
11 MJR9R1SW - 1768.00 M --- -/12M EH11
12 MJR9R1SW - 1768.00 M --- -/12M EH12
13 MJR9R1SW - 1801.00 M --- -/12M EH13
14 MJR9R1SW - 1802.00 M --- -/12M EH14
15 MJR9R1SW - 1858.00 M --- -/12M EH15
16 MJR9R1SW - 1940.00 M --- -/12M EH16
17 MJR9R1SW - 1954.00 M --- -/12M EH17
EH01* /WITHIN THE EASTERN HEMISPHERE/ PUBLISHED RTG 2
1. SIN-DOH-ZRH
EH02* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-DOH-ZRH
2. SIN-MH/QR-KUL-DOH-ZRH
3. SIN-MH/QR-KUL-MH/QR-DOH-ZRH
4. SIN-MH/QR-KUL-QR-DOH-ZRH
EH03* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-DOH-BA/QR-LON-BA/QR-ZRH
2. SIN-DOH-ZRH
EH04* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-DOH-GVA-9B-ZRH
2. SIN-DOH-ZRH
EH05* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA HKT
1. SIN-3K-HKT-DOH-ZRH
2. SIN-3K-HKT-QR-DOH-ZRH
3. SIN-DOH-ZRH
EH06* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA BCN
1. SIN-DOH-BCN-QR/VY-ZRH
2. SIN-DOH-QR-BCN-QR/VY-ZRH
3. SIN-DOH-ZRH
EH07* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA VIE
1. SIN-DOH-VIE-OS-ZRH
2. SIN-DOH-ZRH
EH08* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-3K-HKG-DOH-ZRH
2. SIN-3K-HKG-QR-DOH-ZRH
3. SIN-DOH-ZRH
EH09* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-3K-MNL-DOH-ZRH
2. SIN-3K-MNL-QR-DOH-ZRH
3. SIN-DOH-ZRH
EH10* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA MUC
1. SIN-DOH-MUC-LH-ZRH
2. SIN-DOH-ZRH
EH11* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-BG-DAC-DOH-ZRH
2. SIN-DOH-ZRH
EH12* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-CX/QR-HKG-DOH-ZRH
2. SIN-CX/QR-HKG-QR-DOH-ZRH
3. SIN-DOH-ZRH
EH13* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-DOH-BEG-JU-ZRH
2. SIN-DOH-BUH-SOF-FB-ZRH
3. SIN-DOH-HEL-AY/QR-ZRH
4. SIN-DOH-OSL-SK-ZRH
5. SIN-DOH-QR-ZRH
6. SIN-DOH-SKG-A3-ZRH
7. SIN-DOH-SOF-FB-ZRH
8. SIN-DOH-VIE-OS-ZRH
9. SIN-DOH-ZAG-OU-ZRH
10. SIN-DOH-ZRH
EH14* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA DUB
1. SIN-DOH-DUB-EI-ZRH
2. SIN-DOH-QR-DUB-EI-ZRH
3. SIN-DOH-ZRH
EH15* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA FRA
1. SIN-DOH-FRA-LH-ZRH
2. SIN-DOH-ZRH
EH16* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
TRAVEL MUST BE VIA CMB
1. SIN-3K-CMB-BA/QR-DOH-ZRH
2. SIN-3K-CMB-QR-DOH-ZRH
3. SIN-3K-CMB-UL-MLE-DOH-ZRH
4. SIN-DOH-ZRH
5. SIN-UL-CMB-BA/QR-DOH-ZRH
6. SIN-UL-CMB-QR-DOH-ZRH
7. SIN-UL-CMB-UL-MLE-DOH-ZRH
EH17* /WITHIN THE EASTERN HEMISPHERE/ CONSTRUCTED RTG
1. SIN-DOH-ZRH
2. SIN-QR-DOH-ZRH
3. SIN-VN-SGN-DOH-ZRH
4. SIN-VN-SGN-QR-DOH-ZRH
.</Text>
</DuplicateFareInfo>
</OTA_AirRulesRS></soap-env:Body></soap-env:Envelope>
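The response above is otherwise well-formed, so client code can at least branch on which element came back before deciding what to do next. A minimal sketch in Python (illustration only; the element names are taken from the response above, and how you then narrow the request down to a single fare should be checked against Sabre's OTA_AirRulesLLSRQ documentation, e.g. by adding more qualifiers so that only one of the duplicate fares matches):

```python
import xml.etree.ElementTree as ET

SABRE_NS = "http://webservices.sabre.com/sabreXML/2011/10"

def classify_rules_response(xml_text):
    """Return ('rules', payload) when FareRuleInfo came back,
    ('duplicates', fare-display text) when DuplicateFareInfo came back."""
    root = ET.fromstring(xml_text)
    # Search the whole tree; the payload sits inside the SOAP Body.
    dup = root.find(f".//{{{SABRE_NS}}}DuplicateFareInfo")
    if dup is not None:
        text_el = dup.find(f"{{{SABRE_NS}}}Text")
        return "duplicates", text_el.text if text_el is not None else ""
    rule = root.find(f".//{{{SABRE_NS}}}FareRuleInfo")
    if rule is not None:
        return "rules", ET.tostring(rule, encoding="unicode")
    return "unknown", ""
```

The duplicate-fare text is the same display an agent would see in the terminal, listing each matching fare with a line number, so detecting it early lets you re-query rather than parsing rules that never arrived.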

Related

Make HTML page (text) suitable for text analysis in R

I would like to do some text analytics on text from following web page:
https://narodne-novine.nn.hr/clanci/sluzbeni/full/2007_07_79_2491.html
I don't know how to convert this HTML to a tidy text object (every row of text becomes a row in a data frame).
For example, just applying html_text() function doesn't help:
url <- "https://narodne-novine.nn.hr/clanci/sluzbeni/full/2007_07_79_2491.html"
p <- rvest::read_html(url, encoding = "UTF-8") %>%
rvest::html_text()
p
since I don't have separated rows.
That site has some very well-structured HTML, with the section headers and the body text of each section given their own align attributes. We can use that to extract your text by section:
library(rvest)
library(tidyverse)
pg <- read_html("https://narodne-novine.nn.hr/clanci/sluzbeni/full/2007_07_79_2491.html")
html_nodes(pg, xpath = ".//p[@align='center']/following-sibling::p[@align='justify']") %>%
map_df(~{
data_frame(
section = html_node(.x, xpath=".//preceding-sibling::p[@align='center'][1]") %>%
html_text(trim=TRUE),
section_text = html_text(.x, trim=TRUE)
)
})
## # A tibble: 38 x 2
## section section_text
## <chr> <chr>
## 1 Članak 1. "U Zakonu o autorskom pravu i srodnim pravima (»Narodne novine«, br. 167/03.) u \r\nčlanku 4. sta…
## 2 Članak 2. "U članku 8. stavku 2. točki 1. riječ: »standardi« briše se.\r\nU stavku 3. druga rečenica mijenj…
## 3 Članak 3. "U članku 20. stavku 2. riječi: »na području Republike Hrvatske« zamjenjuju se \r\nriječima: »na …
## 4 Članak 4. "U članku 32. stavku 5. točki 1. i 3. riječ: »naprava« zamjenjuje se riječju: \r\n»uređaja«.\r\nU…
## 5 Članak 5. U članku 39. stavku 1. riječi: »stavka 1.« brišu se.
## 6 Članak 6. "U članku 44. stavku 5. dodaje se rečenica koja glasi:\r\n»U slučaju sumnje, u drugim slučajevima…
## 7 Članak 7. "U članku 52. stavku 3. riječ: »korištenja« zamjenjuje se riječju: \r\n»iskorištavanja«."
## 8 Članak 8. U članku 86. iza riječi: »koji je« dodaje se riječ: »u«.
## 9 Članak 9. "U članku 98. u stavku 1. riječ: »tehnoloških« zamjenjuje se riječju: \r\n»tehničkih«.\r\nStavak …
## 10 Članak 10. "U članku 109. dodaje se stavak 3. koji glasi:\r\n»(3) Odredbe iz članka 20. ovoga Zakona o iscrp…
## # ... with 28 more rows
You'll need to double check that the above didn't miss anything. Even if it did it should be straightforward to expand upon the answer.
You can get individual lines broken out using the above as well:
html_nodes(pg, xpath = ".//p[@align='center']/following-sibling::p[@align='justify']") %>%
map_df(~{
data_frame(
section = html_node(.x, xpath=".//preceding-sibling::p[@align='center'][1]") %>%
html_text(trim=TRUE),
section_text = html_text(.x, trim=TRUE)
)
}) %>%
mutate(section_text = stringi::stri_split_lines(section_text)) %>%
unnest(section_text)
## # A tibble: 334 x 2
## section section_text
## <chr> <chr>
## 1 Članak 1. "U Zakonu o autorskom pravu i srodnim pravima (»Narodne novine«, br. 167/03.) u "
## 2 Članak 1. članku 4. stavak 2. mijenja se i glasi:
## 3 Članak 1. "»(2) Odredbe iz ovoga Zakona o definicijama pojedinih autorskih imovinskih "
## 4 Članak 1. "prava, o pravu na naknadu za reproduciranje autorskog djela za privatno ili "
## 5 Članak 1. "drugo vlastito korištenje, o pravu na naknadu za javnu posudbu, kao i o "
## 6 Članak 1. "iscrpljenju prava distribucije, iznimkama i ograničenjima autorskih prava, "
## 7 Članak 1. "početku tijeka i učincima isteka rokova trajanja autorskog prava, autorskom "
## 8 Članak 1. "pravu u pravnom prometu te o odnosu autorskog prava i prava vlasništva "
## 9 Članak 1. "primjenjuju se na odgovarajući način i za srodna prava, ako za njih nije što "
## 10 Članak 1. posebno određeno ili ne proizlazi iz njihove pravne naravi.«
## # ... with 324 more rows
The tidytext package has examples of how to perform further cleanup transformations to facilitate text mining.
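The pairing that the preceding-sibling XPath performs, attaching each justified paragraph to the closest centered header before it, is just a single ordered pass over the paragraphs. A toy sketch in Python over (align, text) pairs (illustration only, no real HTML parsing):

```python
def group_by_section(paragraphs):
    """paragraphs: ordered (align, text) pairs as they appear in the page.
    Returns (section, text) rows, pairing each 'justify' paragraph with
    the most recent 'center' header seen so far."""
    rows, current = [], None
    for align, text in paragraphs:
        if align == "center":
            current = text          # start of a new section
        elif align == "justify" and current is not None:
            rows.append((current, text))
    return rows
```

This procedural form can be handy for sanity-checking the XPath result: both should produce the same (section, text) rows.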

Extract specific observations of csv file in R

I've imported a csv file using read.csv.
It gives me a data frame with 18k observations of 1 variable, which looks like this:
V1
1 Energies (kJ/mol)
2 Bond Angle Proper Dih. Improper Dih. LJ-14
3 3.12912e+04 4.12307e+03 1.63677e+04 1.25619e+02 1.04394e+04
4 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
5 9.21339e+04 2.82339e+05 -1.15807e+06 -7.21252e+05 -7.25781e+03
6 Step Time Lambda
7 1 1.00000 0.00000
8 Energies (kJ/mol)
9 Bond Angle Proper Dih. Improper Dih. LJ-14
10 2.71553e+04 4.11858e+03 1.63855e+04 1.22226e+02 1.03903e+04
11 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
12 9.20926e+04 2.65253e+05 -1.15928e+06 -7.43766e+05 -7.27887e+03
13 Step Time Lambda
14 2 2.00000 0.00000
...
I want to extract the Potential energy in a vector. I've tried grep and readLines in multiple varieties and functions, but nothing works. Does anybody have an idea how to solve this problem?
Thanks! :)
So is this the right answer (from a former physics major):
Lines <- readLines(textConnection("1 Energies (kJ/mol)
2 Bond Angle Proper Dih. Improper Dih. LJ-14
3 3.12912e+04 4.12307e+03 1.63677e+04 1.25619e+02 1.04394e+04
4 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
5 9.21339e+04 2.82339e+05 -1.15807e+06 -7.21252e+05 -7.25781e+03
6 Step Time Lambda
7 1 1.00000 0.00000
8 Energies (kJ/mol)
9 Bond Angle Proper Dih. Improper Dih. LJ-14
10 2.71553e+04 4.11858e+03 1.63855e+04 1.22226e+02 1.03903e+04
11 Coulomb-14 LJ (SR) Coulomb (SR) Potential Pressure (bar)
12 9.20926e+04 2.65253e+05 -1.15928e+06 -7.43766e+05 -7.27887e+03
13 Step Time Lambda
14 2 2.00000 0.00000"))
> grep("Potential", Lines) # identify the lines with "Potential"
[1] 4 11
Need to move to the next line and get the 5th item:
> read.table(text=Lines[ grep("Potential", Lines)+1])[ , 5]
[1] -721252 -743766
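The same pattern, find the header line, read the next line, take its 5th field, ports directly to other languages. A sketch in Python (note the assumption that the leading row numbers in the pasted data count as the first field; a raw file without them would shift the index):

```python
def potentials(lines):
    """5th whitespace-separated field of each line that follows a line
    containing the word 'Potential'."""
    out = []
    for i, line in enumerate(lines):
        if "Potential" in line and i + 1 < len(lines):
            fields = lines[i + 1].split()
            out.append(float(fields[4]))  # 0-based index 4 == 5th column
    return out
```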

How to speed up the loop running process in R for huge files

I have a genetics sequencing file - 4 million rows. I'm trying to run a piece of code for the variants in each unique gene listed.
Here is an example of how the data looks:
CHROM POS GENE IMPACT HOM
1 23455 A HIGH HET
1 23675 A HIGH HET
1 23895 A MODERATE
1 24115 B LOW HET
1 24335 B HIGH HET
1 24555 B LOW HET
2 6789 C LOW
2 12346 C LOW HET
2 17903 C MODERATE HET
2 23460 C MODERATE
2 29017 D LOW HET
2 34574 D HIGH
2 40131 D HIGH HET
3 567890 E HIGH HET
3 589076 E HIGH
3 610262 E LOW HET
3 631448 F HIGH HET
3 652634 F MODERATE HET
And here is my code:
sam <- read.csv("../sample/sample1.txt", sep="\t",header=TRUE,stringsAsFactors=FALSE)
glist <- unique(sam[,3])
for(i in glist) {
lice <- subset(sam, GENE == i)
lice$mut <- as.numeric(ifelse((lice[c(4)] == 'MODERATE' | lice[c(4)] == 'HIGH'), c(1), c(0)))
lice$cntmut <- sum(lice$mut, na.rm=TRUE)
lice$het <- as.numeric(ifelse(lice$HOM == 'HET', c(1), c(0)))
lice$cnthet <- sum(lice$het, na.rm=TRUE)
lice$cnthetmut <- lice$mut + lice$het
lice$lice <- ifelse(lice$mut == 1 & lice$cntmut >= 2 & lice$het == 1 & lice$cnthet >= 2 & lice$cnthetmut == 2 , 'lice', '')
write.table(lice,paste0("../sample/list/",i,".txt"),sep="\t",quote=F,row.names=F)
}
licelist <- list.files("../sample/list/", full.names=T)
lice2 <- do.call("rbind",lapply(licelist, FUN=function(files){read.table(files, header=TRUE, sep="\t", stringsAsFactors=FALSE)}))
lice_out <- merge(sam,lice2,by.x=c("CHROM","POS"),by.y=c("CHROM","POS"), all=T)
write.table(lice_out,"../sample/sample1_lice.txt",sep="\t",quote=F,row.names=F)
I have 30,000 genes, which means running this code will take about 2 weeks (the original file is about 4GB in size). I was wondering whether anyone had any advice on how to speed this up? I've tried writing a function to include all this info (some of which is repetitive) but to no avail.
Just to add:
The code in the loop is essentially doing the following:
1. adding up how many variants in each gene are moderate or high, and how many are het.
2. lice is given to a variant in a gene if the variant is moderate/high, is a het, and only if there are at least two of these types of variants in the gene
For this result:
CHROM POS GENE IMPACT HOM LICE
1 23455 A HIGH HET lice
1 23675 A HIGH HET lice
1 23895 A MODERATE
1 24115 B LOW HET
1 24335 B HIGH HET
1 24555 B LOW HET
2 6789 C LOW
2 12346 C LOW HET
2 17903 C MODERATE HET
2 23460 C MODERATE
2 29017 D LOW HET
2 34574 D HIGH
2 40131 D HIGH HET
3 567890 E HIGH HET
3 589076 E HIGH
3 610262 E LOW HET
3 631448 F HIGH HET lice
3 652634 F MODERATE HET lice
Like I mentioned a bit further up, the steps are not all necessary but worked at the time when I was doing it on a smaller data frame.
It's a bit difficult to help you when you don't explain what you are trying to accomplish, or provide an example of what the desired result looks like with your sample dataset, but here are a few suggestions:
(1) Use data tables. They are much faster and use memory much more efficiently.
(2) Other than the sums (cntmut, cnthet), I don't see why you split the original table. There are other ways to get the sums without splitting the dataset.
(3) I don't really see the point of the merge at the end.
Here's an option that will likely be much faster.
library(data.table)
dt <- data.table(sam)
setkey(dt,GENE)
dt[,mut:=as.numeric(IMPACT=="MODERATE"|IMPACT=="HIGH")]
dt[,cntmut:=sum(mut), by=GENE]
dt[,het:=as.numeric(HOM=="HET")]
dt[,cnthet:=sum(het),by=GENE]
dt[,cnthetmut:=mut+het]
dt[,lice:=ifelse(mut==1 & cntmut>=2 & het==1 & cnthet>=2 & cnthetmut==2,'lice',''), by=GENE]
head(dt)
# CHROM POS GENE IMPACT HOM mut cntmut het cnthet cnthetmut lice
# 1: 1 23455 A HIGH HET 1 3 1 2 2 lice
# 2: 1 23675 A HIGH HET 1 3 1 2 2 lice
# 3: 1 23895 A MODERATE 1 3 0 2 1
# 4: 1 24115 B LOW HET 0 1 1 3 1
# 5: 1 24335 B HIGH HET 1 1 1 3 2
# 6: 1 24555 B LOW HET 0 1 1 3 1
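Reading the question's stated goal literally, a variant gets "lice" when it is MODERATE/HIGH *and* HET and its gene carries at least two such variants; that interpretation reproduces the desired output for genes A and F. It can be sketched in two passes with no library at all (Python, toy illustration rather than a literal port of the flag arithmetic above):

```python
from collections import Counter

def label_lice(rows):
    """rows: (gene, impact, hom) tuples, one per variant.
    Pass 1 counts damaging-AND-het variants per gene; pass 2 labels
    each variant that is itself damaging-and-het in a gene with >= 2."""
    damaging_het = Counter()
    for gene, impact, hom in rows:
        if impact in ("MODERATE", "HIGH") and hom == "HET":
            damaging_het[gene] += 1
    return ["lice" if impact in ("MODERATE", "HIGH") and hom == "HET"
            and damaging_het[gene] >= 2 else ""
            for gene, impact, hom in rows]
```

Two linear passes like this (or the equivalent grouped assignment in data.table) replace the 30,000 subset-and-write iterations entirely, which is where the two weeks were going.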

providing date in x-axis in R

Can anyone please tell me how to give the start and end as dates in a time series in R? I know how to give a sequence, say:
ts <- ts(temp, start=1, end=10)
But I want to show it starting at Jan 01 and ending on Jan 10 instead of just 1 to 10. Thanks in advance.
The basic time-series functionality in ts is probably not going to be enough for you. There are a lot of available tools for working with time-series in R, but the ts class is geared towards representing
"regularly spaced time series (using numeric time stamps). Hence, it is
particularly well-suited for annual, monthly, quarterly data, etc"
If you describe your data correctly, the print command will format it nicely. If you wanted your data divided into months, you could do something like this (note the frequency of 12):
> print(ts(round(rnorm(44)), start = c(2012,3), frequency = 12), calendar = TRUE)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2012 2 1 -2 -1 1 1 0 -1 1 -1
2013 1 0 2 1 -1 1 -1 -1 0 2 1 2
2014 2 0 -1 0 1 1 0 0 2 -1 0 0
2015 2 0 0 -1 0 0 0 1 2 0
Since you want daily intervals, you're going to want to set frequency to 365:
> print(ts(letters, start = c(2013, 1), frequency = 365), calendar = TRUE)
p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 p20 p21
2013 a b c d e f g h i j k l m n o p q r s t u
p22 p23 p24 p25 p26
2013 v w x y z
Which is going to look rather awkward, but will solve your problem in that it won't just give you a number for each day. However, as stated in the docs, the ts class only supports "numeric time stamps" so this is probably the best you're going to get with built in ts features.
If you want more advanced features I would have a look at some of the tools in this documentation.

sql query help for the table below

My table structure is:
table_system:
"ID" NUMBER NOT NULL ENABLE,
"COUNTRY" VARCHAR2(10 BYTE) NOT NULL ENABLE,
"COMPANYCODE" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"SYSTEM" VARCHAR2(50 BYTE) NOT NULL ENABLE,
"NOTSTARTED" NUMBER,
"RUNNING" NUMBER,
"COMPLETED" NUMBER,
"ACTUALSTARTTIME" VARCHAR2(5 BYTE),
"ACTUALENDTIME" VARCHAR2(5 BYTE),
"SEQUENCE" NUMBER,
"PLANNEDSTARTTIME" VARCHAR2(5 BYTE),
"PLANNEDENDTIME" VARCHAR2(5 BYTE),
"ESTIMATEDENDTIME" VARCHAR2(5 BYTE),
CONSTRAINT "SYSTEMRUNTIME_PK" PRIMARY KEY ("ID", "COUNTRY", "COMPANYCODE", "SYSTEM") USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "SYSTEM" ENABLE
I need a query that will fetch the following output:
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8 … SYSTEM N
-------------------------------------------------- --------------------------- ------------------- ----------------------- ------------------------------ ------------------------------ -------------------- -------------- -------------- -------------- --------------
where systems are sorted as per the "SEQUENCE" attribute.
I have tried this query :
select distinct companycode, sequence, system,notstarted,running,completed
from table_system
where id = (select max(id) from table_system)
order by companycode, sequence
this fetches me the following
COMPANYCODE SEQUENCE SYSTEM NOTSTARTED RUNNING COMPLETED
-------------------------------------------------- ---------------------- -------------------------------------------------- ---------------------- ---------------------- ----------------------
1001 Helsinki Branch 1 GAP 2 / Datastage GL 0 0 3
1001 Helsinki Branch 2 SAP GL 0 0 2
1001 Helsinki Branch 3 SAP BW 0 0 2
1002 Copenhagen Branch 1 GAP 2 / Datastage GL 0 0 3
1002 Copenhagen Branch 2 SAP GL 0 0 2
1002 Copenhagen Branch 3 SAP BW 0 0 2
1003 Oslo Branch 1 GAP 2 / Datastage GL 0 0 3
1003 Oslo Branch 2 SAP GL 0 0 2
1003 Oslo Branch 3 SAP BW 0 0 2
1004 (publ) (EUR) 1 EKO 0 0 13
1004 (publ) (EUR) 2 HA Core 0 0 6
1004 (publ) (EUR) 3 HA Post Processor 0 0 5
1004 (publ) (EUR) 4 Datastage GL 3 0 10
1004 (publ) (EUR) 5 Datastage Recon 1 0 3
1004 (publ) (EUR) 11 SAP GL 0 0 4
1004 (publ) (EUR) 21 SAP BW 0 0 4
but I want the output to be :
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8 … SYSTEM N
-------------------------------------------------- --------------------------- ------------------- ----------------------- ------------------------------ ------------------------------ -------------------- -------------- -------------- -------------- --------------
1001 Helsinki Branch GAP 2 / Datastage GL SAP GL SAP BW
1002 Copenhagen Branch GAP 2 / Datastage GL SAP GL SAP BW
1003 Oslo Branch GAP 2 / Datastage GL SAP GL SAP BW
1004 (publ) (EUR) EKO HA Core HA Post Processor Datastage GL Datastage Recon SAP GL SAP BW
Any hint for the above will be highly appreciated.
Thank You
vinayak
Try this:
select companycode, COLLECT(system) as systems
from table_system
where id = (select max(id) from table_system)
group by companycode
order by companycode
You can use a pivot operation for this, but you can't handle an unknown number of systems (you need to know the number of selected columns at parse time):
select * from
(
select companycode, system,
row_number() over (partition by id, country, companycode
order by sequence) as rn
from table_system
where id = (select max(id) from table_system)
)
pivot (max(system) for rn in (1 as system1, 2 as system2, 3 as system3,
4 as system4, 5 as system5, 6 as system6, 7 as system7, 8 as system8))
order by companycode;
COMPANYCODE SYSTEM1 SYSTEM2 SYSTEM3 SYSTEM4 SYSTEM5 SYSTEM6 SYSTEM7 SYSTEM8
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------------------------------------------------
1001 Helsinki Branch GAP 2 / Datastage GL SAP GL SAP BW
1002 Copenhagen Branch GAP 2 / Datastage GL SAP GL SAP BW
1003 Oslo Branch GAP 2 / Datastage GL SAP GL SAP BW
1004 (publ) (EUR) EKO HA Core HA Post Processor Datastage GL Datastage Recon SAP GL SAP BW
So you'd need to establish the maximum number of systems you'll ever have present, and add clauses to the pivot (9 as system9, ...) to accommodate them all. The row_number() translates the sequence numbers into contiguous numbers, so you don't have a big gap between the 5th and 6th systems for company 1004; apart from anything else, without it you'd need the pivot to handle the maximum possible sequence number rather than the maximum count of systems.
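To see what the row_number() step buys you: it packs the sparse SEQUENCE values (1, 2, 3, 4, 5, 11, 21 for company 1004) into contiguous slots 1..7 before the pivot. The same packing, sketched outside SQL (Python, illustration only):

```python
from collections import defaultdict

def pivot_systems(rows):
    """rows: (companycode, sequence, system) tuples. Returns
    {companycode: [system1, system2, ...]} with systems in SEQUENCE
    order but packed into contiguous positions, mirroring row_number()."""
    by_company = defaultdict(list)
    for company, seq, system in rows:
        by_company[company].append((seq, system))
    # Sorting by sequence, then discarding it, is exactly the
    # row_number() renumbering: position in the list == rn.
    return {c: [s for _, s in sorted(pairs)]
            for c, pairs in by_company.items()}
```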
