Add Header to an XML to CSV conversion using XQuery - xquery

Hi, I am trying to convert some XML to CSV using XQuery, and a previous post helped me get to this point:
for $b in /root/Result
return
concat(escape-html-uri(string-join(($b/HolidayEndDate,
$b/HolidayType,
$b/FirstName,
$b/AllowanceRemainingDays,
$b/HolidayStartDate,
$b/EmployeeId,
$b/AllowanceDays,
$b/LastName,
$b/HolidayDurationDays
)
/normalize-space(),
",")
),
codepoints-to-string(10))
This returns all of the data as required but no Header row. Is there a simple addition to the above code that would also return the header row? Thanks. :)

Since your query returns a sequence of lines, you can just prepend another line before the FLWOR expression:
"HolidayEndDate,HolidayType,FirstName,AllowanceRemainingDays,HolidayStartDate,EmployeeId,AllowanceDays,LastName,HolidayDurationDays&#10;",
for $b in /root/Result
return
concat(escape-html-uri(string-join(($b/HolidayEndDate,
$b/HolidayType,
$b/FirstName,
$b/AllowanceRemainingDays,
$b/HolidayStartDate,
$b/EmployeeId,
$b/AllowanceDays,
$b/LastName,
$b/HolidayDurationDays
)
/normalize-space(),
",")
),
codepoints-to-string(10))
Because nested sequences are flattened (i.e. concatenated) in XQuery, this results in one output sequence including the header. Note also that I used the character reference '&#10;' for the newline character, which is much shorter than codepoints-to-string(10).
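For comparison, here is a minimal Python sketch of the same "header line first, then one line per Result" idea. It is an analogue, not XQuery: the column names come from the question, while `SAMPLE` and `results_to_csv` are made up for illustration. Unlike the XQuery above, a missing element produces an empty column here, and values containing commas are not quoted.

```python
import xml.etree.ElementTree as ET

# Column names taken from the question.
FIELDS = [
    "HolidayEndDate", "HolidayType", "FirstName",
    "AllowanceRemainingDays", "HolidayStartDate", "EmployeeId",
    "AllowanceDays", "LastName", "HolidayDurationDays",
]

# Hypothetical input; only three of the nine fields are present.
SAMPLE = """<root>
  <Result>
    <HolidayEndDate>2020-01-03</HolidayEndDate>
    <HolidayType>Annual</HolidayType>
    <FirstName>  Sam  </FirstName>
  </Result>
</root>"""


def results_to_csv(xml_text):
    """Return CSV text: one header line, then one line per <Result>."""
    root = ET.fromstring(xml_text)
    lines = [",".join(FIELDS)]  # the header is simply the first line
    for result in root.findall("Result"):
        # " ".join(s.split()) collapses whitespace, similar to normalize-space()
        values = [
            " ".join(result.findtext(name, default="").split())
            for name in FIELDS
        ]
        lines.append(",".join(values))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(results_to_csv(SAMPLE), end="")
```

For real data, Python's csv module would handle quoting of commas and quotes inside values.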

concat("HolidayEndDate,HolidayType,FirstName,AllowanceRemainingDays,HolidayStartDate,EmployeeId,AllowanceDays,LastName,HolidayDurationDays&#10;",
string-join(
for $b in /root/Result
return
concat(escape-html-uri(string-join(($b/HolidayEndDate,
$b/HolidayType,
$b/FirstName,
$b/AllowanceRemainingDays,
$b/HolidayStartDate,
$b/EmployeeId,
$b/AllowanceDays,
$b/LastName,
$b/HolidayDurationDays
)
/normalize-space(),
",")
),
codepoints-to-string(10)), "")
)

Related

Insert characters when a string changes its case R

I would like to insert characters at the places where a string changes its case. I tried inserting a '\n' after a fixed number of characters and then a ' ', because I can't figure out how to detect the case change:
s <-c("FloridaIslandE7", "FloridaIslandE9", "Meta")
gsub('^(.{7})(.{6})(.*)$', '\\1\\\n\\2 \\3', s )
[1] "Florida\nIsland E7" "Florida\nIsland E9" "Meta"
This works because the positions are fixed but I would like to know how to do it for the general case.
Surely there's a less convoluted regex for this, but you could try:
gsub('([a-z])([A-Z][0-9])', '\\1 \\2', gsub('([a-z])([A-Z][a-z])', '\\1\n\\2', s))
Output:
[1] "Florida\nIsland E7" "Florida\nIsland E9" "Meta"
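Here is an option using str_replace_all from the stringr package, with lookarounds to match the position between a lowercase and an uppercase letter: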
str_replace_all(s, "(?<=[a-z])(?=[A-Z])", "\n")
#[1] "Florida\nIsland\nE7" "Florida\nIsland\nE9" "Meta"
If you really want to insert \n, try this:
gsub("([a-z])([A-Z])", "\\1\\\n\\2", s)
[1] "Florida\nIsland\nE7" "Florida\nIsland\nE9" "Meta"
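The lookaround idea from the answers above can also be sketched in Python, for readers who want to compare. This is an analogue of the R code, not a replacement for it; `split_on_case_change` is a name made up for this example.

```python
import re


def split_on_case_change(text, sep="\n"):
    """Insert `sep` wherever a lowercase letter is directly followed by an uppercase one."""
    # The pattern is zero-width: it matches the position between the two
    # letters, so no characters are consumed or need to be put back.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", sep, text)


if __name__ == "__main__":
    for s in ["FloridaIslandE7", "FloridaIslandE9", "Meta"]:
        print(repr(split_on_case_change(s)))
```

Because the match is a position rather than a pair of characters, the replacement string needs no backreferences.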

Remove space in print statement in python

While using the below print command:
print(k,':',dict[k])
I get the output shown below, but I want to remove the space between the key and the colon. How can I do that?
Current Output:
Sam : 40
Required Output:
Sam: 40
You could try printing a single string built by concatenation (wrap the value in str() in case it is not already a string):
print(k + ': ' + str(dict[k]))
Python's print() function has a sep parameter that defaults to a single space, so the comma-separated arguments you pass to it are separated by a space when printed.
I think what you are looking for is
print(k, ': ', dict[k], sep='')
# Sam: 40
Simply specifying the "sep" parameter solves your issue.
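A small self-contained sketch of both approaches, plus an f-string variant. The dictionary `ages` and the helper `format_entry` are invented for this example (the question's variable is called `dict`, which shadows the built-in type, so a different name is used here).

```python
def format_entry(key, value):
    """Build 'key: value' with no space before the colon."""
    return f"{key}: {value}"


if __name__ == "__main__":
    ages = {"Sam": 40}  # hypothetical data matching the question's output
    for k in ages:
        # sep='' removes the automatic space between arguments,
        # so the space after the colon has to be written explicitly.
        print(k, ": ", ages[k], sep="")
        # Equivalent result with a single formatted string.
        print(format_entry(k, ages[k]))
```

The f-string form avoids both the str() conversion and the sep argument.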

XQuery Type of value does not match

declare variable $fb := doc("factbook.xml")/mondial;
for $c in $fb//country
where ($c/encompassed/@continent = 'f0_119') and ($c/@population < 100000)
return concat('Country: ',$c/name, ', Population: ',$c/@population);
it returns:
Type Error: Type of value '
()
' does not match sequence type: xs:anyAtomicType?
At characters 11681-11698
At File "q2_3.xq", line 4, characters 13-67
At File "q2_3.xq", line 4, characters 13-67
At File "q2_3.xq", line 4, characters 13-67
However, if I return just the name or the population instead of the concat, it works. The strangest thing is that I have another query:
declare variable $fb := doc("factbook.xml")/mondial;
for $c in $fb//country
where $c/religions = 'Seventh-Day Adventist'
order by $c/name
return concat('Country: ',$c/name, ', Population: ',$c/@population);
The return syntax is exactly the same, yet this one works.
Why does this happen?
Without seeing an example of your data it's impossible to say for sure, but if $c/name returns more than one value, then your error would make sense. Do you have any results where there is more than one name element?

String recognition in idl

I have the following strings:
F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_East_A.dat
F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_Froemke-Hoy.dat
and from each I want to extract the three variables, 1. SWIR32 2. the date and 3. the text following the date. I want to automate this process for about 200 files, so individually selecting the locations won't exactly work for me.
so I want:
variable1=SWIR32
variable2=2005210
variable3=East_A
variable4=SWIR32
variable5=2005210
variable6=Froemke-Hoy
I am going to be using these to add titles to graphs later on, but since the position of the text in each string varies, I am unsure how to do this using STRMID.
I think you want to use a combination of STRPOS and STRSPLIT. Something like the following:
s = ['F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_East_A.dat', $
'F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_Froemke-Hoy.dat']
name = STRARR(s.length)
date = name
txt = name
foreach sub, s, i do begin
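sub = STRMID(sub, 1+STRPOS(sub, '\', /REVERSE_SEARCH))  ; keep only the file name
sub = STRMID(sub, 0, STRPOS(sub, '.', /REVERSE_SEARCH))  ; drop the '.dat' extension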
parts = STRSPLIT(sub, '_', /EXTRACT)
name[i] = parts[0]
date[i] = parts[1]
txt[i] = STRJOIN(parts[2:*], '_')
endforeach
You could also do this with a regular expression (using just STRSPLIT) but regular expressions tend to be complicated and error prone.
Hope this helps!
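For comparison, the same steps (take the file name, drop the extension, split on underscores, keep the remainder together) look like this in Python. This is only an analogue of the IDL approach; `parse_name` is a name made up for this sketch, and it assumes every file name has at least two underscores.

```python
from pathlib import PureWindowsPath


def parse_name(path):
    """Split 'NAME_DATE_rest.ext' into (name, date, rest)."""
    # PureWindowsPath understands backslash separators on any platform;
    # .stem is the file name without its final extension.
    stem = PureWindowsPath(path).stem
    # maxsplit=2 keeps any further underscores inside the third part.
    name, date, text = stem.split("_", 2)
    return name, date, text


if __name__ == "__main__":
    paths = [
        r"F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_East_A.dat",
        r"F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_Froemke-Hoy.dat",
    ]
    for p in paths:
        print(parse_name(p))
```

Limiting the split to two separators plays the same role as joining parts[2:*] back together in the IDL version.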

grep on two strings

I'm working to grab two different elements in a string.
The string look like this,
str <- c('a_abc', 'b_abc', 'abc', 'z_zxy', 'x_zxy', 'zxy')
I have tried the different options in ?grep, but I can't get it right. I'm doing something like this:
grep('[_abc]:[_zxy]',str, value = TRUE)
and what I would like is,
[1] "a_abc" "b_abc" "z_zxy" "x_zxy"
any help would be appreciated.
Use parentheses ( ) for grouping, not square brackets [ ], which define a character class:
grep('_(abc|zxy)',str, value = TRUE)
[1] "a_abc" "b_abc" "z_zxy" "x_zxy"
To make the grep a bit more flexible, you could do something like:
grep('_.{3}$',str, value = TRUE)
Which will match an underscore _ followed by any character . three times {3} followed immediately by the end of the string $
this should work: grep('_abc|_zxy', str, value=T)
X|Y matches when either X matches or Y matches
In this case just doing:
str[grep("_",str)]
will work... is it more complicated in your specific case?
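The two regex patterns suggested above behave the same way in Python's re module, which may help when checking them outside R. The helper `grep` below is a made-up stand-in for R's grep(..., value = TRUE), and `items` copies the vector from the question.

```python
import re


def grep(pattern, items):
    """Return the items that contain a match for `pattern`."""
    return [s for s in items if re.search(pattern, s)]


if __name__ == "__main__":
    items = ["a_abc", "b_abc", "abc", "z_zxy", "x_zxy", "zxy"]
    # Alternation inside a group: underscore followed by 'abc' or 'zxy'.
    print(grep(r"_(abc|zxy)", items))
    # More general: underscore followed by exactly three characters at the end.
    print(grep(r"_.{3}$", items))
```

By contrast, the character class `[_abc]` from the question matches a single character that is `_`, `a`, `b`, or `c`, which is why the original pattern did not do what was intended.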
