BizTalk mapping RECADV D96A
I need to convert the CSV file to EDIFACT RECADV D96A.
The input:
REC;A;ABC;120769;4502902610;0196466358;ABC;;202003051329;OB:505+DP:8718
RECD;1;110000;45;;
RECD;1;120000;50;;
RECD;1;130000;100;;
RECD;2;200000;21;;
RECD;2;210000;12;;
And the output should be:
LIN+1++1:EN'
GIN+BJ+110000:45'
GIN+BJ+120000:50'
GIN+BJ+130000:100'
LIN+2++2:EN'
GIN+BJ+200000:21'
GIN+BJ+210000:12'
This is what I have managed to do so far (it only takes the first GIN for each line):
LIN+1++1:EN'
GIN+BJ+110000:45'
LIN+2++2:EN'
GIN+BJ+200000:21'
How can I get all of the values for each distinct line number, instead of just the first?
Thank you in advance!
How to Concatenate multiple repetitive nodes into a single node - BizTalk
I have something like this in an input XML:

<OrderText>
  <text_type>0012</text_type>
  <text_content>Text1</text_content>
</OrderText>
<OrderText>
  <text_type>ZT03</text_type>
  <text_content>Text2</text_content>
</OrderText>

I need to map the above data, after concatenating, to the schema below:

<Order>
  <Note>0012:Text1#ZT03:Text2</Note>
</Order>

Can anyone please help?
I'm going to assume that your input actually has a Root node, as otherwise it is not valid XML:

<Root>
  <OrderText>
    <text_type>0012</text_type>
    <text_content>Text1</text_content>
  </OrderText>
  <OrderText>
    <text_type>ZT03</text_type>
    <text_content>Text2</text_content>
  </OrderText>
</Root>

Then all you need is a map with a String Concatenate functoid with

Input[0] = text_type
Input[1] = :
Input[2] = text_content
Input[3] = #

that goes into a Cumulative Concatenate functoid. This will give you an output of

<Order>
  <Note>0012:Text1#ZT03:Text2#</Note>
</Order>

Note: there is an extra # at the end, but you could use some more functoids to trim that off if needed.
You can use the Value-Mapping Flattening functoid in a map, then feed the result of each into a Concatenate functoid to generate the result string. The map can be executed on a port or in an orchestration.
JSON format in R refuses to parse?
Here is my toy JSON:

"[[143828095,86.82525,78.50037,0.011764707,1.0,1,1], [143828107,86.82525,78.50037,0.015686275,1.0,1,0], [143828174,84.82802,83.49646,0.015686275,1.0,1,0], [143828190,83.3301,92.4895,0.011764707,1.0,1,0], [143828206,83.3301,92.4895,0.011764707,1.0,1,-1], [143828251,119.482666,98.4848,0.03137255,1.0,2,1], [143828325,123.30899,95.93237,0.027450982,1.0,2,0], [143828334,128.47015,92.4895,0.027450982,1.0,2,0], [143828351,128.47015,92.4895,0.027450982,1.0,2,-1], [143828406,115.19141,60.514465,0.019607844,1.0,3,1], [143828529,121.183105,61.51367,0.019607844,1.0,3,0], [143828551,121.183105,61.51367,0.019607844,1.0,3,-1], [143828696,105.502075,94.26935,0.023529414,1.0,8,1], [143828773,105.502075,94.26935,0.023529414,1.0,8,-1], [143829030,78.24274,58.18811,0.023529414,1.0,DEL,1], [143829107,78.24274,58.18811,0.023529414,1.0,DEL,-1], [143831178,127.47159,76.28339,0.023529414,1.0,8,1], [143831244,127.47159,76.28339,0.023529414,1.0,8,-1]]"

Now I want to parse it with fromJSON(), but the DEL entries within the JSON prevent me from doing so. Please advise how to fix it.
You can replace "DEL" with, say, 0:

# fromJSON() is provided by a JSON package such as jsonlite or rjson; load whichever you use.
json_string <- "[[143828095,86.82525,78.50037,0.011764707,1.0,1,1], [143828107,86.82525,78.50037,0.015686275,1.0,1,0], [143828174,84.82802,83.49646,0.015686275,1.0,1,0], [143828190,83.3301,92.4895,0.011764707,1.0,1,0], [143828206,83.3301,92.4895,0.011764707,1.0,1,-1], [143828251,119.482666,98.4848,0.03137255,1.0,2,1], [143828325,123.30899,95.93237,0.027450982,1.0,2,0], [143828334,128.47015,92.4895,0.027450982,1.0,2,0], [143828351,128.47015,92.4895,0.027450982,1.0,2,-1], [143828406,115.19141,60.514465,0.019607844,1.0,3,1], [143828529,121.183105,61.51367,0.019607844,1.0,3,0], [143828551,121.183105,61.51367,0.019607844,1.0,3,-1], [143828696,105.502075,94.26935,0.023529414,1.0,8,1], [143828773,105.502075,94.26935,0.023529414,1.0,8,-1], [143829030,78.24274,58.18811,0.023529414,1.0,DEL,1], [143829107,78.24274,58.18811,0.023529414,1.0,DEL,-1], [143831178,127.47159,76.28339,0.023529414,1.0,8,1], [143831244,127.47159,76.28339,0.023529414,1.0,8,-1]]"

# Replace every DEL with a number (you can make the zero any number you like).
json_string <- gsub("DEL", 0, json_string)
fromJSON(json_string)
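If you would rather keep those entries as missing values instead of inventing a number, a small variation is to map DEL to JSON null; jsonlite turns null into NA when it simplifies the parsed array. A minimal sketch, assuming fromJSON comes from the jsonlite package (other JSON packages may simplify differently), shown on a shortened version of the string:

library(jsonlite)

# Two rows from the question, one of them containing DEL.
json_string <- "[[143828095,86.82525,78.50037,0.011764707,1.0,1,1],
                 [143829030,78.24274,58.18811,0.023529414,1.0,DEL,1]]"

json_string <- gsub("DEL", "null", json_string)  # null is valid JSON, DEL is not
fromJSON(json_string)                            # numeric matrix; the DEL cell becomes NA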
Using a JSON parser (http://json.parser.online.fr/) and just deleting the "DEL" entries at the respective places seems to fix the issue.
Matching the first and last characters in a FASTA file
I have fasta sequences like the following:

fasta_sequences
seq1_1 "MTFJKASDKASWQHBFDDFAHJKLDPAL"
seq1_2 "GTRFKJDAIUETZUQOIHHASJKKJHPAL"
seq1_3 "MTFJHAZOQIIREUUBSDFHGTRF"
seq2_1 "JUZGFNBGTFCKAJDASEJIJAS"
seq2_1 "MTFHJHJASBBCMASDOEQSDPAL"
seq2_3 "RTZIIASDPLKLKLKLLJHGATRF"
seq3_1 "HMTFLKBNCYXBASHDGWPQWKOP"
seq3_2 "MTFJKASDJLKIOOIEOPWEIOKOP"

I would like to retain only those sequences which start with MTF and end with either KOP, TRF or PAL. At the end it should look like:

seq1_1 "MTFJKASDKASWQHBFDDFAHJKLDPAL"
seq1_3 "MTFJHAZOQIIREUUBSDFHGTRF"
seq2_1 "MTFHJHJASBBCMASDOEQSDPAL"
seq3_2 "MTFJKASDJLKIOOIEOPWEIOKOP"

I tried the following code in R, but the result contained nothing:

new_fasta=grep("^MTF.*(PAL|TRF|KOP)$")

Could anyone help me get the desired output? Thanks in advance.
This is the way to go, I guess. For every element in fasta_sequences (assuming fasta_sequences is a vector containing the sequences):

newseq = list()
it = 1
for (i in fasta_sequences) {                     # i is seq1_1, seq1_2 etc.
  a = substr(i, 1, 3)                            # first three characters
  if (a == "MTF") {
    x = substr(i, (nchar(i) - 2), nchar(i))      # last three characters
    if (x == "PAL" | x == "KOP" | x == "TRF") {
      newseq[it] = i                             # keep this sequence
      it = it + 1
    }
  }
}

Hope it helps.
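A tiny follow-up to the loop above: if you want the result as a plain character vector rather than a list, unlist() will flatten it.

new_fasta <- unlist(newseq)   # flatten the kept sequences into a character vector
new_fasta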
new_fasta = grep("^MTF.*(PAL|TRF|KOP)$", fasta_sequences, perl = TRUE)

Pass the vector you are filtering (fasta_sequences) as the second argument, and note that in R the option is written perl = TRUE, not True. grep() returns the indices of the matching elements; add value = TRUE if you want the matching sequences themselves.
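For completeness, a small self-contained sketch of the grep() approach, under the assumption that fasta_sequences is a plain (optionally named) character vector like the one in the question:

# Toy subset of the data from the question; the names are optional.
fasta_sequences <- c(
  seq1_1 = "MTFJKASDKASWQHBFDDFAHJKLDPAL",
  seq1_2 = "GTRFKJDAIUETZUQOIHHASJKKJHPAL",
  seq1_3 = "MTFJHAZOQIIREUUBSDFHGTRF",
  seq3_1 = "HMTFLKBNCYXBASHDGWPQWKOP"
)

# value = TRUE returns the matching elements themselves rather than their indices.
new_fasta <- grep("^MTF.*(PAL|TRF|KOP)$", fasta_sequences, value = TRUE)
new_fasta   # seq1_1 and seq1_3 match; seq1_2 and seq3_1 do not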
Merging a large number of csv datasets
Here are 2 sample datasets:

PRISM-APPT_1895.csv https://copy.com/SOO2KbCHBX4MRQbn
PRISM-APPT_1896.csv https://copy.com/JDytBqLgDvk6JzUe

I have 100 of these types of datasets that I'm trying to merge into one data frame, export to CSV, and then merge into another very large dataset. I need to merge everything by "gridNumber" and "Year", creating a time series dataset. Originally, I imported all of the annual datasets and then tried to merge them with this:

df <- join_all(list(Year_1895, Year_1896, Year_1897, Year_1898, Year_1899, Year_1900, Year_1901, Year_1902, Year_1903, Year_1904,
                    Year_1905, Year_1906, Year_1907, Year_1908, Year_1909, Year_1910, Year_1911, Year_1912, Year_1913, Year_1914,
                    Year_1915, Year_1916, Year_1917, Year_1918, Year_1919, Year_1920, Year_1921, Year_1922, Year_1923, Year_1924,
                    Year_1925, Year_1926, Year_1927, Year_1928, Year_1929, Year_1930, Year_1931, Year_1932, Year_1933, Year_1934,
                    Year_1935, Year_1936, Year_1937, Year_1938, Year_1939, Year_1940, Year_1941, Year_1942, Year_1943, Year_1944,
                    Year_1945, Year_1946, Year_1947, Year_1948, Year_1949, Year_1950, Year_1951, Year_1952, Year_1953, Year_1954,
                    Year_1955, Year_1956, Year_1957, Year_1958, Year_1959, Year_1960, Year_1961, Year_1962, Year_1963, Year_1964,
                    Year_1965, Year_1966, Year_1967, Year_1968, Year_1969, Year_1970, Year_1971, Year_1972, Year_1973, Year_1974,
                    Year_1975, Year_1976, Year_1977, Year_1978, Year_1979, Year_1980, Year_1981, Year_1982, Year_1983, Year_1984,
                    Year_1985, Year_1986, Year_1987, Year_1988, Year_1989, Year_1990, Year_1991, Year_1992, Year_1993, Year_1994,
                    Year_1995, Year_1996, Year_1997, Year_1998, Year_1999, Year_2000),
               by = c("gridNumber", "Year"), type = "full")

But R keeps crashing, I think because the merge is a bit too large for it to handle, so I'm looking for something that would work better. Maybe data.table? Or another option. Thanks for any help you can provide.
Almost nine months later and your question has no answer. I could not find your datasets; however, I will show one way to do the job. It is trivial in awk. Here is a minimal awk script:

BEGIN {
    for (i = 0; i < 10; i++) {
        filename = "out" i ".csv";       # build the next filename
        while (getline < filename)       # read every record from that file
            print $0;                    # and send it to standard output
        close(filename);
    }
}

The script is run as

awk -f s.awk

where s.awk is the above script in a text file. The script builds ten filenames, out0.csv, out1.csv ... out9.csv, which are the already-existing files with the data. Each file is opened in turn, all of its records are sent to standard output, and the file is then closed before the next filename is built and opened. The above script has little to offer over a command-line read/redirect; you would typically use awk to process a long list of filenames read from another file, with statements to selectively ignore lines or columns depending on various criteria.
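Since the question mentions data.table as an option, here is a minimal sketch of how the annual files could be read, stacked into one long table and then merged with the larger dataset. It assumes the files follow the PRISM-APPT_<year>.csv naming shown above, sit in the working directory, and all share the same columns (including gridNumber and Year); other_data is just a placeholder name for the second dataset.

library(data.table)

# Read every annual file (assumed naming pattern) and stack the rows into one long table.
files  <- sprintf("PRISM-APPT_%d.csv", 1895:2000)
annual <- rbindlist(lapply(files, fread), use.names = TRUE, fill = TRUE)

# Export the combined time series to CSV if a copy on disk is needed.
fwrite(annual, "PRISM-APPT_combined.csv")

# Merge with the other large dataset (placeholder name) on gridNumber and Year.
# other_data <- fread("other_large_dataset.csv")
# merged <- merge(other_data, annual, by = c("gridNumber", "Year"), all = TRUE)

Stacking the years with rbindlist instead of repeatedly joining them avoids the huge intermediate tables that a chain of full joins creates, which is usually what makes this kind of merge run out of memory.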
Find oldest files in a directory based on the filename timestamp in Unix
I want the files in the directory to be listed oldest first, based on the date and timestamp embedded in the file name.

Example input:

AAAG11020709581.txt
AAAG13020709581.txt
AACL11020709581.txt
AACL13020709581.txt
AAFU11020709581.txt
AAFU13020709581.txt
AAHO11020709581.txt
AAHO13020709581.txt
AAPC11020709581.txt
AAPC13020709581.txt
AAPO11020709581.txt
AAPO13020709581.txt
AATR11020709581.txt
AATR13020709581.txt
AARC11020709581.txt
AARC13020709581.txt

Expected output:

AAAG11020709581.txt
AACL11020709581.txt
AAFU11020709581.txt
AAHO11020709581.txt
AAPC11020709581.txt
AAPO11020709581.txt
AARC11020709581.txt
AATR11020709581.txt
AAAG13020709581.txt
AACL13020709581.txt
AAFU13020709581.txt
AAHO13020709581.txt
AAPC13020709581.txt
AAPO13020709581.txt
AARC13020709581.txt
AATR13020709581.txt

Can anyone please suggest?
By default, sort uses the beginning of the line as the key. You can tell it to start at a different place with the -k FIELD.OFFSET notation; e.g. if all the filenames begin with 4 letters, you can skip those characters like this:

sort -k1.5

Output:

AAAG11020709581.txt
AACL11020709581.txt
AAFU11020709581.txt
AAHO11020709581.txt
AAPC11020709581.txt
AAPO11020709581.txt
AARC11020709581.txt
AATR11020709581.txt
AAAG13020709581.txt
AACL13020709581.txt
AAFU13020709581.txt
AAHO13020709581.txt
AAPC13020709581.txt
AAPO13020709581.txt
AARC13020709581.txt
AATR13020709581.txt