How to remove BLANK spaces while converting HTML to TEXT using libxml2 HTMLParser

How to remove BLANK spaces while converting HTML to TEXT using libxml2 HTMLParser - libxml2

How to condense multiple white spaces to single/remove all white spaces here? see the following source code and example input.
void walkTree(xmlNode * a_node) {
xmlNode *cur_node = NULL;
for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
if (cur_node->type == XML_TEXT_NODE) {
printf("%s", cur_node->content);
}
walkTree(cur_node->children);
}
}
void main () {
htmlParserCtxtPtr parser = htmlCreatePushParserCtxt(NULL, NULL, NULL, 0, NULL, 0);
htmlCtxtUseOptions(parser, HTML_PARSE_NOBLANKS | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET);
len = ReadFile(data);
htmlParseChunk(parser, data, len, 0);
walkTree(xmlDocGetRootElement(parser->myDoc));
}
input: "<table><tbody><tr><td><b>Updated by:</b> </td><td >Test</td></tr></tbody></table><br/>"
output: "Updated by: Test"
Thanks.

Try to use this function:
str.erase(remove_if(str.begin(), str.end(), isspace), str.end());

Related

printf("%s",stringName) prints the wrong text but only once

I have a menu function, in which I input a question and two options, then the user choses one. It works just fine everytime but one ; I call
if (menu("ou est le corps?","interieur ","exterieur")==1)
{
but instead of printing "interieur " It shows "p?"
it works just fine without the space, but I need to make a space and \n does quite the same thing.
I have another call of this function, with \n which works fine so I have no idea about why this wouldn't work. Anyone has got an idea?
PS : the value of choix1 is then sent via bluetooth, and there it stays intact.
PPS : tell me if something is unclear, I'm not naturally english
PPPS(sorry) : tried to run the same code again, it seems to print a random character followed by "?", I had twice "p?", once "?" and once " '?"
[updates] once "#?"
int menu (String texte, String choix1, String choix2)
{
envoye = 0;
rxValue = "0";
while (digitalRead(M5_BUTTON_HOME) != LOW && rxValue == "0")
{
heure();
M5.Lcd.setTextSize(2);
M5.Lcd.print(texte);
M5.Lcd.printf("\n");
if (selec == 0)
{
M5.Lcd.printf("->%s %s", choix1, choix2);
}
else
{
M5.Lcd.printf(" %s ->%s", choix1, choix2);
}
if (M5.BtnB.read() != 0)
{
if (selec == 0)
{
selec = 1;
}
else
{
selec = 0;
}
while (M5.BtnB.read() != 0)
{
if(digitalRead(M5_BUTTON_HOME) == LOW)
{
M5.Lcd.fillScreen(BLACK);
delay(1000);
if(digitalRead(M5_BUTTON_HOME) == LOW)
{
choix=50;
heure();
delay(1000);
return 1;
}
}
}
}
if (deviceConnected && envoye == 0)
{
sendchoix(texte, choix1, choix2);
envoye++;
}
}
if (rxValue != "0")
{
recuble = &rxValue[0];
selec = atoi(recuble) - 1;
rxValue = "0";
}
M5.Lcd.fillScreen(BLACK);
delay(300);
return selec;
}

int menu (String texte, String choix1, String choix2) {
[...]
M5.Lcd.printf("->%s %s", choix1, choix2);
You cannot treat String objects as const char*, which is what the format specifier %s is expecting. String is an Arduino class for storing.. strings/character data, but an object of this class is not equivalent to the raw pointer to the data.
For that, you need to call the c_str() method on the String object to get the C-String pointer to the data, as shown in the documentation [1].
[..]
M5.Lcd.printf("->%s %s", choix1.c_str(), choix2.c_str());
[..]
[1] https://www.arduino.cc/reference/en/language/variables/data-types/string/functions/c_str/

Javacc error reporting results in “Expansion can be matched by empty string.”

I am trying to add some custom error messages to my javacc parser to hopefully make the error messages more specific and the language problems easier to find and correct.
The first error that I am trying to focus in on is how to detect that the correct number of arguments have been provided to a 'function' call. Rather than the default message, I would like to print out something like "missing argument to function".
My simplified language and my attempt to catch a missing argument error looks something like:
double arg(boolean allowMissing):
{ double v; Token t; }
{
t = <INT> { return Double.parseDouble(t.image); }
| t = <DOUBLE> { return Double.parseDouble(t.image); }
| v = functions() { return v; }
| { if (!allowMissing) throw new ParseException("Missing argument");} // #1 Throw error if missing argument
}
double functions() :
{ double v1, v2, result;
double[] array;
}
{
(<MIN> "(" v1=arg(false) "," v2=arg(false) ")") { return (v1<v2)?v1:v2; }
| (<MAX> "(" v1=arg(false) "," v2=arg(false) ")") { return (v1>v2)?v1:v2; }
| (<POW> "(" v1=arg(false) "," v2=arg(false) ")") { return Math.pow(v1, v2); }
| (<SUM> "(" array=argList() ")") { result=0; for (double v:array) result+=v; return result;}
}
double[] argList() :
{
ArrayList<Double> list = new ArrayList<>();
double v;
}
{
( (v=arg(true) { list.add(v);} ( "," v=arg(false) {list.add(v);} )*)?) { // #2 Expansion can be matched by empty string here
double[] arr = new double[list.size()];
for (int i=0; i<list.size(); i++)
arr[i] = list.get(i);
return arr;
}
}
As you can see functions will recursively resolve their arguments, and this allows function call to be nested.
Here are a few valid expressions that can be parsed in this language:
"min(1,2)",
"max(1,2)",
"max(pow(2,2),2)",
"sum(1,2,3,4,5)",
"sum()"
Here is an invalid expression:
"min()"
This all works well until I tried to check for missing arguments (code location #1). This works fine for the functions that have a fixed number of arguments. The problem is that the sum function (code location #2) is allowed to have zero arguments. I even passed in a flag to not throw an error if missing arguments are allowed. however, javacc gives me an error at location #2 that "Expansion within "(...)?" can be matched by empty string". I understand why I get this error. I have also read the answer for JavaCC custom errors cause "Expansion can be matched by empty string." but it did not help me.
My problem is that I just cannot see how I can have this both ways. I want to throw an error for missing arguments in the functions that have a fixed number of arguments, but I don't want an error in the function that allows no arguments. Is there a way to refactor my parser so that I still use the recursive style, catch missing arguments from the functions that take a fixed arguments, yet allow some functions to have zero arguments?
Or is there a better way to add in custom error messages? I am not really seeing much in the documentation.
Also, any pointers to examples that use more sophisticated error reporting would be greatly appreciated. I am actually using jjtree, but I simplified it down for this example.

Here's what I would do.
Instead of using a boolean argument in function arg, I would use the ? operator:
double arg():
{ double v; Token t; }
{
t = <INT> { return Double.parseDouble(t.image); }
| t = <DOUBLE> { return Double.parseDouble(t.image); }
| v = functions() { return v; }
}
double functions() :
{ double v1=0, v2=0, result;
double[] array;
}
{
(<MIN> "(" (v1=arg())? "," (v2=arg())? ")") { return (v1<v2)?v1:v2; }
| (<MAX> "(" (v1=arg())? "," (v2=arg())? ")") { return (v1>v2)?v1:v2; }
| (<POW> "(" (v1=arg())? "," (v2=arg())? ")") { return Math.pow(v1, v2); }
| (<SUM> "(" array=argList() ")") { result=0; for (double v:array) result+=v; return result;}
}
double[] argList() :
{
List<Double> list = new ArrayList<Double>();
double v;
}
{
( (v=arg() { list.add(v); } | { list.add(0.); } )
( "," (v=arg() { list.add(v); } | { list.add(0.); } ) )*) {
double[] arr = new double[list.size()];
for (int i=0; i<list.size(); i++)
arr[i] = list.get(i);
return arr;
}
}

You could do this
double arg():
{ double v; Token t; }
{
t = <INT> { return Double.parseDouble(t.image); }
| t = <DOUBLE> { return Double.parseDouble(t.image); }
| v = functions() { return v; }
}
double argRequired():
{ double v; }
{
v = arg() { return v ; }
| { if (!allowMissing) throw new ParseException("Missing argument");} // #1 Throw error if missing argument
}
double argOptional( double defaultValue ): // Not needed for this example, but might be useful.
{ double v; }
{
v = arg() { return v ; }
| { return defaultValue ; }
}
double functions() :
{ double v1, v2, result;
double[] array;
}
{
(<MIN> "(" v1=argRequired() "," v2=argRequired() ")") { return (v1<v2)?v1:v2; }
| (<MAX> "(" v1=argRequired() "," v2=argRequired() ")") { return (v1>v2)?v1:v2; }
| (<POW> "(" v1=argRequired() "," v2=argRequired() ")") { return Math.pow(v1, v2); }
| (<SUM> "(" array=argList() ")") { result=0; for (double v:array) result+=v; return result;}
}
double[] argList( ) :
{
ArrayList<Double> list = new ArrayList<>();
double v;
}
{
( v=arg() { list.add(v);}
( "," v=argRequired() {list.add(v);}
)*
)?
{
double[] arr = new double[list.size()];
for (int i=0; i<list.size(); i++)
arr[i] = list.get(i);
return arr;
}
}

directions for use javacc token

I want to distinguish multiple tokens.
Look at my code.
TOKEN :
{
< LOOPS :
< BEAT >
| < BASS >
| < MELODY >
>
| < #BEAT : "beat" >
| < #BASS : "bass" >
| < #MELODY : "melody" >
}
void findType():
{Token loops;}
{
loops = < LOOPS >
{ String type = loops.image; }
I want to use the findType () function to find the type.
How can I get return the correct output when the input is "beat"?

What you want to do is to add a return statement, like this:
String findType():
{Token loops;}
{
loops = < LOOPS >
{
String type = loops.image;
return type;
}
}
Have in mind you have changed return value definition in the method, from void to String.
Then, from your main:
ExampleGrammar parser = new ExampleGrammar(System.in);
while (true)
{
System.out.println("Reading from standard input...");
System.out.print("Enter loop:");
try
{
String type = ExampleGrammar.findType();
System.out.println("Type is: " + type);
}
catch (Exception e)
{
System.out.println("NOK.");
System.out.println(e.getMessage());
ExampleGrammar.ReInit(System.in);
}
catch (Error e)
{
System.out.println("Oops.");
System.out.println(e.getMessage());
break;
}
}
It generates an output like:
Reading from standard input...
Enter loop:bass
Type is: bass
Reading from standard input...
Enter loop:beat
Type is: beat

Is it possible to use a Kleene Operator for Flex Formatters?

is it possible to use a Kleene Operator (Kleene Star) for the Formatters?
I want to use a phoneFormatter, which puts a minus after the 5th number and afterwards it should be possible to have a variable number of numbers.
E.g.: 0172-555666999, 0160-44552 etc.
That is how I started, but I don't know which character belongs after the last hash (it is not a star, I already tried it ;-) ):
<fx:Declarations>
<mx:PhoneFormatter id="mPhoneFormat"
formatString="####-#"/>
</fx:Declarations>

The default PhoneFormatter expects the input string to have the same number of characters as the format string. They don't support regular expression patterns (like * to match the element zero or more times).
However, it's pretty easy to make your own formatter. To do this, I extended the PhoneFormatter class and overrode its format() method. I copied and pasted the original format() method and made the following modifications:
comment out the code that compared the length of the source string with the length of the format string
compare the length of the formatted string. If the original string is longer, append the remaining chars from the original string to the formatted string.
This probably won't handle all of your use cases, but it should be pretty straightforward to modify this to your needs.
package
{
import mx.formatters.PhoneFormatter;
import mx.formatters.SwitchSymbolFormatter;
public class CustomPhoneNumberFormatter extends PhoneFormatter
{
public function CustomPhoneNumberFormatter()
{
super();
}
override public function format(value:Object):String
{
// Reset any previous errors.
if (error)
error = null;
// --value--
if (!value || String(value).length == 0 || isNaN(Number(value)))
{
error = defaultInvalidValueError;
return "";
}
// --length--
var fStrLen:int = 0;
var letter:String;
var n:int;
var i:int;
n = formatString.length;
for (i = 0; i < n; i++)
{
letter = formatString.charAt(i);
if (letter == "#")
{
fStrLen++;
}
else if (validPatternChars.indexOf(letter) == -1)
{
error = defaultInvalidFormatError;
return "";
}
}
// if (String(value).length != fStrLen)
// {
// error = defaultInvalidValueError;
// return "";
// }
// --format--
var fStr:String = formatString;
if (fStrLen == 7 && areaCode != -1)
{
var aCodeLen:int = 0;
n = areaCodeFormat.length;
for (i = 0; i < n; i++)
{
if (areaCodeFormat.charAt(i) == "#")
aCodeLen++;
}
if (aCodeLen == 3 && String(areaCode).length == 3)
{
fStr = String(areaCodeFormat).concat(fStr);
value = String(areaCode).concat(value);
}
}
var dataFormatter:SwitchSymbolFormatter = new SwitchSymbolFormatter();
var source:String = String(value);
var returnValue:String = dataFormatter.formatValue(fStr, value);
if (source.length > returnValue.length)
{
returnValue = returnValue + source.substr(returnValue.length-1);
}
return returnValue;
}
}
}

No more than 9 digits after separator with spark.formatters.NumberFormatter?

I must display very small values (capacitor) in a Flex-AdvancedDataGrid
I use spark.formatters.NumberFormatter.
If I use 3, 6 or 9 for fractionalDigits, everything is fine.
But if I use 12, because I need 12 digits after decimal separator, then the value is cut after 9 digits!
Is there a way to get more then 9 digits after separator.
Or is there a way to use a formatting like "4.7 E-12" (Must be E-9, E-12, E-15 and so on)

toPrecision and toFixed works fine up to 20 digits. Thats enough.
I will write a function on this base to get results like 4.7 E-12.
Thanks for the help
Jan

I created a custom pattern formatter class that allows you to specify how you want a string/number formatted using whatever symbols or structure you want. Depending on your requirements, this may help you. Feel free to modify this class as needed.
CustomPatternFormatter.as
package
{
import mx.formatters.Formatter;
public class CustomPatternFormatter extends Formatter
{
private static const VALID_PATTERN_CHARACTERS:String = "#,.-";
// Use # as a placeholder for a regular
// character in the input string.
// Then add other characters between the
// # symbol for the desired output format.
// ex. The pattern ##,##,##.## with input 12345678
// will output 12,34,56.78
public var formatPattern:String;
// If True, the input string must match the number
// of # characters in the formatPattern.
public var inputMustMatchPatternLength:Boolean;
//Constructor
public function CustomPatternFormatter()
{
super();
formatPattern = "";
inputMustMatchPatternLength = false;
}
// Override format().
override public function format(value:Object):String
{
// Reset error if it exists.
if (error)
error = null;
// If value is null, or empty String just return ""
// but treat it as an error for consistency.
// Users will ignore it anyway.
if (!value || (value is String && value == ""))
{
error = "Cannot convert an empty value";
return "";
}
// Check to see if the input value must match the format pattern
if (inputMustMatchPatternLength && String(value).length != countOccurrences(formatPattern, "#"))
{
error = "The input value length does not match the format pattern length.";
return "";
}
// If the value is valid, format the string.
var fStrLen:int = 0;
var letter:String;
var n:int;
var i:int;
var v:int;
// Make sure the formatString is valid.
n = formatPattern.length;
for (i = 0; i < n; i++)
{
letter = formatPattern.charAt(i);
if (letter == "#")
{
fStrLen++;
}
else if (VALID_PATTERN_CHARACTERS.indexOf(letter) == -1)
{
error = "You can only use the following symbols in the formatPattern: " + VALID_PATTERN_CHARACTERS;
return "";
}
}
var returnString:String = "";
var vStr:String = String(value).replace(".", "").split("").reverse().join("");
var fArr:Array = formatPattern.split("").reverse();
var fChar:String;
// Format the string
for (v = 0; v < vStr.length; v++) { if (fArr.length > 0)
{
do
{
fChar = fArr.shift();
if (fChar != "#")
returnString += fChar;
} while (fChar != "#" && fArr.length > 0);
}
returnString += vStr.charAt(v);
}
// Return the formatted string
return returnString.split("").reverse().join("");
}
protected function countOccurrences(str:String, char:String):int
{
var count:int = 0;
for (var i:int=0; i < str.length; i++)
{
if (str.charAt(i) == char)
{
count++;
}
}
return count;
}
}
}

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to remove BLANK spaces while converting HTML to TEXT using libxml2 HTMLParser - libxml2

Try to use this function: str.erase(remove_if(str.begin(), str.end(), isspace), str.end());

Related

printf("%s",stringName) prints the wrong text but only once

Javacc error reporting results in “Expansion can be matched by empty string.”

directions for use javacc token

Is it possible to use a Kleene Operator for Flex Formatters?

No more than 9 digits after separator with spark.formatters.NumberFormatter?

Categories

Resources