I'm trying to convert an xml document into a specific tab separated flat file structure. Most of the elements can be mapped to single columns or concatenated simply using fn:string-join(), but I have some elements where the mapping is more complicated. An example element looks like this:
<record>
<details>
<passports>
<passport country="">0018061/104</passport>
<passport country="UK">0354761445</passport>
<passport country="USA">M001806145</passport>
</passports>
</details>
</record>
and I need to create a column that looks like this:
0018061/104;(UK) 0354761445;(USA) M001806145
so if the @country attribute is not "" it is wrapped in (), otherwise it is omitted. The element value follows, and the entries are separated by ;.
Here's what I have done so far:
for $record in //record
return concat($record/@uid/string(),
(: ... other columns ... :)
" ", <S>{for $r in //$record/details/passports/passport
return concat("(", $r/@country, ") ", $r, ";")}</S>/string()
,"
")
I'm sure there's an easier way, but this almost does the job - it produces:
() 0018061/104;(UK) 0354761445;(USA) M001806145
Ideally I'd like to know the correct way to do this; otherwise, just removing the empty brackets where @country="" would suffice.
Use an if clause right in the outer concat (I added some newlines for better readability in the answer, you can of course remove them as you wish):
concat(
if ($r/@country != "")
then concat("(", $r/@country, ") ")
else "",
$r,
";"
)
New result of the query:
0018061/104; (UK) 0354761445; (USA) M001806145;
You could also go for an implicit loop
/record/details/passports/passport/string-join(
(
" ",
if (@country != "")
then "(" || @country || ") "
else (),
.
), ""
)
or explicitly loop over the results and still have a cleaner query (by replacing the concatenation operator || by respective concat(...) calls, you would stay XQuery 1.0 compatible):
for $record in /record/details/passports/passport
return (
" " || (
if ($record/@country != "")
then "(" || $record/@country || ") "
else ()
) || $record
)
Both cases use the implicit newlines inserted by BaseX in-between tokens; alternatively, you can of course add them explicitly, as you had before.
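For comparison, the formatting rule itself (bracket the country only when it is non-empty, then join the entries with ;) can be sketched outside XQuery. This is a minimal Java illustration, not part of the original query: the class name PassportColumn and the helpers formatOne and formatAll are hypothetical, and each passport is assumed to be a two-element array of country and number.

```java
import java.util.ArrayList;
import java.util.List;

public class PassportColumn {
    /** Formats one passport as "(COUNTRY) NUMBER", or just NUMBER when the country is empty. */
    public static String formatOne(String country, String number) {
        return country.isEmpty() ? number : "(" + country + ") " + number;
    }

    /** Joins the formatted passports with ';' so that no trailing separator is produced. */
    public static String formatAll(List<String[]> passports) {
        List<String> parts = new ArrayList<>();
        for (String[] p : passports) {
            parts.add(formatOne(p[0], p[1])); // p[0] = country, p[1] = passport number
        }
        return String.join(";", parts);
    }

    public static void main(String[] args) {
        List<String[]> passports = List.of(
            new String[] {"", "0018061/104"},
            new String[] {"UK", "0354761445"},
            new String[] {"USA", "M001806145"});
        System.out.println(formatAll(passports));
    }
}
```

Joining the already formatted entries, rather than appending ";" to each one, is what avoids the trailing separator; in XQuery the same idea corresponds to wrapping the per-passport expression in string-join(..., ";").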
I am writing a function and need to wrap a string in quotes. The code I ultimately want to produce, with 'table_df' quoted, is the following:
if (exists('table_df') && is.data.frame(get('table_df'))&nrow(table_df)>0) {
tracking_sheet$var1[tracking_sheet$var2=="table_name"]<-'Completed'
} else {tracking_sheet$var1[tracking_sheet$var2=="table_name"]<-'Check'}
This is my function, but it doesn't work, mainly because of the quotes around the string: paste('", table_df, "',sep=""). So my question is how to use paste or another function to produce the final result 'table_df'.
check<-defmacro(tracking_sheet,table_df,table_name,
expr={if (exists(paste('", table_df, "',sep="")) && is.data.frame(get(paste('", table_df, "',sep="")))&nrow(table_df)>0) {
tracking_sheet$var1[tracking_sheet$var2==table_name]<-'Completed'
} else {tracking_sheet$var1[tracking_sheet$var2==table_name]<-'Check'}
})
check(tracking_sheet,app_df_pivot,"T_Applications")
The code above is meant to build a summary sheet reporting which data frames exist in the environment and whether each one contains data. Any advice is welcome, and thank you!
I think you mean to use
paste('"', table_df, '"',sep="")
Without the closing single-quotes,
paste('", table_df, "',sep="")
evaluates to ", table_df, "
If you want to paste a quote character, you have to escape it with a backslash (\):
paste0("\'", "example", "\'")
I need to find the elements whose attribute values contain a space " ".
For example:
<unit href="http:xxxx/unit/2 ">
Suppose the href attribute above ends with a space.
I have done this with a FLWOR query, but I need it to be done with CTS functions. Please suggest an approach.
For the FLWOR query I tried this:
let $x := (
for $d in doc()
order by $d//id
return
for $attribute in data($d//@href)
return
if (fn:contains($attribute," ")) then
<td>{(concat( "id = " , $d//id) ,", data =", $attribute)}</td>
else ()
)
return <tr>{$x}</tr>
This is working fine.
For CTS I have tried
let $query :=
cts:element-attribute-value-query(xs:QName("methodology"),
xs:QName("href"),
xs:string(" "),
"wildcarded")
let $search := cts:search(doc(), $query)
return fn:count($search)
Your query is looking for " " to be the entirety of the value of the attribute. If you want to look for attributes that contain a space, then you need to use wildcards. However, since there is no indexing of whitespace except for exact value queries (which are by definition not wildcarded), you are not going to get a lot of index support for that query, so you'll need to run this as a filtered search (which you have in your code above) with a lot of false positives.
You may be better off creating a string range index on the attribute and doing value-match on that.
How can I make Java print "Hello"?
When I type System.out.print("Hello"); the output will be Hello. What I am looking for is "Hello" with the quotes("").
System.out.print("\"Hello\"");
The double quote character has to be escaped with a backslash in a Java string literal. Other characters that need special treatment include:
Carriage return and newline: "\r" and "\n"
Backslash: "\\"
Single quote: "\'"
Horizontal tab and form feed: "\t" and "\f"
The complete list of Java string and character literal escapes may be found in the section 3.10.6 of the JLS.
It is also worth noting that you can include arbitrary Unicode characters in your source code using Unicode escape sequences of the form \uxxxx where the xs are hexadecimal digits. However, these are different from ordinary string and character escapes in that you can use them anywhere in a Java program, not just in string and character literals; see JLS sections 3.1, 3.2 and 3.3 for details on the use of Unicode in Java source code.
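The escapes listed above, and the point about Unicode escapes being usable outside literals, can be seen in a small sketch. The class name EscapeDemo and the helper quoted are made up for illustration.

```java
public class EscapeDemo {
    /** Wraps a value in literal double-quote characters. */
    public static String quoted(String value) {
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quoted("Hello"));   // prints "Hello" including the quotes
        System.out.println("C:\\temp");        // prints C:\temp
        System.out.println("a\tb");            // prints a, a tab character, then b

        // The escape below is translated to the letter 'c' before the compiler
        // tokenizes the source, so this line declares a variable named count.
        int \u0063ount = 3;
        System.out.println(count);             // prints 3
    }
}
```

Because Unicode escapes are translated so early, writing the escape for a double quote inside a string literal would end the literal rather than add a quote character to it; the ordinary escape \" is the safe choice there.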
See also:
The Oracle Java Tutorial: Numbers and Strings - Characters
In Java, is there a way to write a string literal without having to escape quotes? (Answer: No)
char ch='"';
System.out.println(ch + "String" + ch);
Or
System.out.println('"' + "ASHISH" + '"');
Escape double-quotes in your string: "\"Hello\""
More on the topic (check 'Escape Sequences' part)
You can do it using a unicode character also
System.out.print('\u0022' + "Hello" + '\u0022');
Adding the actual quote characters is only a tiny fraction of the problem; once you have done that, you are likely to face the real problem: what happens if the string already contains quotes, or line feeds, or other unprintable characters?
The following method will take care of everything:
public static String escapeForJava( String value, boolean quote )
{
    StringBuilder builder = new StringBuilder();
    if( quote )
        builder.append( "\"" );
    for( char c : value.toCharArray() )
    {
        if( c == '\\' ) // a backslash must itself be escaped
            builder.append( "\\\\" );
        else if( c == '\'' )
            builder.append( "\\'" );
        else if ( c == '\"' )
            builder.append( "\\\"" );
        else if( c == '\r' )
            builder.append( "\\r" );
        else if( c == '\n' )
            builder.append( "\\n" );
        else if( c == '\t' )
            builder.append( "\\t" );
        else if( c < 32 || c >= 127 ) // other control and non-ASCII characters
            builder.append( String.format( "\\u%04x", (int)c ) );
        else
            builder.append( c );
    }
    if( quote )
        builder.append( "\"" );
    return builder.toString();
}
System.out.println("\"Hello\"");
There are two easy methods:
Use backslash \ before double quotes.
Use two single quotes ('') in place of each double quote; note that this prints two apostrophes, which only resemble a double quote.
For example:
System.out.println("\"Hello\"");
System.out.println("''Hello''");
Take note of how backslashes behave in front of specific characters.
System.out.println("Hello\\");
The output above will be:
Hello\
System.out.println(" Hello\" ");
The output above will be:
Hello"
Use Escape sequence.
\"Hello\"
This will print "Hello".
You can use JSON serialization utilities to quote a Java String, like this:
public class Test{
public static String quote(String a){
return JSON.toJsonString(a);
}
}
If the input is hello, the output will be "hello".
If you want to implement the function yourself, it might look like this:
public static String quotes(String origin) {
// every \ -> \\ ; as a regex: \\ => \\\\ ; written as a Java string literal: \\\\ ==> \\\\\\\\
origin = origin.replaceAll("\\\\", "\\\\\\\\");
// " -> \" regExt: \" => \\\" quote to param: \\\" ==> \\\\\\\"
origin = origin.replaceAll("\"", "\\\\\\\"");
// newline -> \n
origin = origin.replaceAll("\\n", "\\\\\\n");
// tab -> \t
origin = origin.replaceAll("\\t", "\\\\\\t");
return origin;
}
The implementation above escapes the special characters in the string but does not add the " at the start and end.
It is also incomplete: if you need other escape characters, you can add them.
I am extracting data from an XML file and I need to extract a delimited list of sub-elements. I have the following:
for $record in //record
let $person := $record/person/names
return concat($record/@uid/string()
,",", $record/@category/string()
,",", $person/first_name
,",", $person/last_name
,",", $record/details/citizenships
,"
")
The element "citizenships" contains sub-elements called "citizenship" and as the query stands it sticks them all together in one string, e.g. "UKFrance". I need to keep them in one string but separate them, e.g. "UK|France".
Thanks in advance for any help!
fn:string-join($arg1 as xs:string*, $arg2 as xs:string) is what you're looking for here.
In your currently desired usage, that would look something like the following:
fn:string-join($record/details/citizenships/citizenship, "|")
Testing outside your document, with:
fn:string-join(("UK", "France"), "|")
...returns:
UK|France
Notably, ("UK", "France") is a sequence of strings, just as a query returning multiple citizenships would likewise be a sequence (the entries in which will be evaluated for their string value when passed to fn:string-join(), which is typed as taking a sequence of strings for its first argument).
Consider the following (simplified) query:
declare context item := document { <root>
<record uid="1">
<person>
<citizenships>
<citizenship>France</citizenship>
<citizenship>UK</citizenship>
</citizenships>
</person>
</record>
</root> };
for $record in //record
return concat(fn:string-join($record//citizenship, "|"), "
")
...and its output:
France|UK
I have a folder with files containing some text. I am trying to go through all the files one after the other, and count how many times we see every word in the text files.
I know how to open the file, but once I'm in the file I don't know how to read each word one after the other, and go to the next word.
If anyone has some ideas to guide me it'd be great.
Read the file a line at a time into a string using Get_Line, then break the line up into the individual words.
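As an illustration of that approach (line by line, then split into words and count), here is a minimal sketch written in Java rather than Ada, purely to show the shape of the algorithm. The class name WordCount and the method countWords are hypothetical, and a "word" is assumed to be a maximal run of letters, compared case-insensitively.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class WordCount {
    /** Counts how often each word occurs across the given lines. */
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (String line : lines) {
            // Split on every run of characters that are not letters.
            for (String word : line.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (!word.isEmpty()) { // a leading delimiter yields one empty token
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(List.of("the cat, the dog", "The end.")));
    }
}
```

For a folder of files, the lines of each file could be fed into the same map in turn, for example by reading each file with java.nio.file.Files.readAllLines.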
Here's one way of doing it, I needed some playtime with the containers.
Using streams is still the best solution for your problem, given the multiple files.
Text_Search.ads
Pragma Ada_2012;
With
Ada.Containers.Indefinite_Ordered_Maps;
Package Text_Search with Elaborate_Body is
Text : Constant String :=
ASCII.HT &
"We hold these truths to be self-evident, that all men are created " &
"equal, that they are endowed by their Creator with certain unalienable "&
"Rights, that among these are Life, Liberty and the pursuit of " &
"Happiness.--That to secure these rights, Governments are instituted " &
"among Men, deriving their just powers from the consent of the governed" &
", --That whenever any Form of Government becomes destructive of these " &
"ends, it is the Right of the People to alter or to abolish it, and to " &
"institute new Government, laying its foundation on such principles " &
"and organizing its powers in such form, as to them shall seem most " &
"likely to effect their Safety and Happiness. Prudence, indeed, will " &
"dictate that Governments long established should not be changed for " &
"light and transient causes; and accordingly all experience hath shewn, "&
"that mankind are more disposed to suffer, while evils are sufferable, " &
"than to right themselves by abolishing the forms to which they are " &
"accustomed. But when a long train of abuses and usurpations, pursuing " &
"invariably the same Object evinces a design to reduce them under " &
"absolute Despotism, it is their right, it is their duty, to throw off " &
"such Government, and to provide new Guards for their future security." &
"now the necessity which constrains them to alter their former Systems " &
"of Government. The history of the present King of Great Britain is a " &
"history of repeated injuries and usurpations, all having in direct " &
"object the establishment of an absolute Tyranny over these States. To " &
"prove this, let Facts be submitted to a candid world.";
Package Word_List is New Ada.Containers.Indefinite_Ordered_Maps(
Key_Type => String,
Element_Type => Positive
);
Function Create_Map( Words : String ) Return Word_List.Map;
Words : Word_List.map;
End Text_Search;
Text_Search.adb
Package Body Text_Search is
Function Create_Map( Words : String ) Return Word_List.Map is
Delimiters : Array (Character) of Boolean:=
('.' | ' ' | '-' | ',' | ';' | ASCII.HT => True, Others => False);
Index, Start, Stop : Positive := Words'First;
begin
Return Result : Word_List.Map do
Parse:
loop
Start:= Index;
-- Ignore initial delimiters.
while Delimiters(Words(Start)) loop
Start:= 1+Start;
end loop;
Stop:= Start;
while not Delimiters(Words(Stop)) loop
Stop:= 1+Stop;
end loop;
declare
-- Because we stop *on* a delimiter we mustn't include it.
Subtype R is Positive Range Start..Stop-1;
Substring : String renames Words(R);
begin
-- if it's there, increment; otherwise add it.
if Result.Contains( Substring ) then
Result(Substring):= 1 + Result(Substring);
else
Result.Include( Key => substring, New_Item => 1 );
end if;
end;
Index:= Stop + 1;
end loop parse;
exception
When Constraint_Error => null; -- we run until our index fails.
end return;
End Create_Map;
Begin
Words:= Create_Map( Words => Text );
End Text_Search;
Test.adb
Pragma Ada_2012;
Pragma Assertion_Policy( Check );
With
Text_Search,
Ada.Text_IO;
Procedure Test is
Procedure Print_Word( Item : Text_Search.Word_List.Cursor ) is
use Text_Search.Word_List;
Word : String renames Key(Item);
Word_Column : String(1..20) := (others => ' ');
begin
Word_Column(1..Word'Length+1):= Word & ':';
Ada.Text_IO.Put_Line( Word_Column & Positive'Image(Element(Item)) );
End Print_Word;
Begin
Text_Search.Words.Iterate( Print_Word'Access );
End Test;
Instead of going by individual words, you could read the file a line at a time into a string using Get_Line, and then use regular expressions: Regular Expressions in Ada?
If you're using Ada 2012 here's how I would recommend doing it:
1. With Ada.Containers.Indefinite_Ordered_Maps.
2. Instantiate a Map with String as key, and Positive as element.
3. Grab the string; I'd use either a single string or stream-processing.
4. Break the input-text into words; this can be done on-the-fly if you use streams.
5. When you get a word (from #4) add it to your map if it doesn't exist, otherwise increment the element.
6. When you are finished, just run a For Element of WORD_MAP Loop that prints the string & count.
There are several ways you could handle strings in #3:
Perfectly sized via recursive function call [terminating on a non-word character or the end of input].
Unbounded_String
Vector (Positive, Character) -- append for valid characters, convert to array [string] and add to the map when an invalid-character [or the end of input] is encountered -- working variable.
Not Null Access String working variable.