Date Time Parser using YACC: shift/reduce conflicts

I have the following YACC parser:
%start Start
%token _DTP_LONG // Any number; Max upto 4 Digits.
%token _DTP_SDF // 17 Digit number indicating SDF format of Date Time
%token _DTP_EOS // end of input
%token _DTP_MONTH //Month names e.g Jan,Feb
%token _DTP_AM //Is A.M
%token _DTP_PM //Is P.M
%%
Start : DateTimeShortExpr
| DateTimeLongExpr
| SDFDateTimeExpr EOS
| DateShortExpr EOS
| DateLongExpr EOS
| MonthExpr EOS
;
DateTimeShortExpr : DateShortExpr TimeExpr EOS {;}
| DateShortExpr AMPMTimeExpr EOS {;}
;
DateTimeLongExpr : DateLongExpr TimeExpr EOS {;}
| DateLongExpr AMPMTimeExpr EOS {;}
;
DateShortExpr : Number { rc = vDateTime.SetDate ((Word) $1, 0, 0);
}
| Number Number { rc = vDateTime.SetDate ((Word) $1, (Word) $2, 0); }
| Number Number Number { rc = vDateTime.SetDate ((Word) $1, (Word) $2, (Word) $3); }
;
DateLongExpr : Number AbsMonth { // case : number greater than 31, consider as year
if ($1 > 31) {
rc = vDateTime.SetDateFunc (1, (Word) $2, (Word) $1);
}
// Number is considered as days
else {
rc = vDateTime.SetDateFunc ((Word) $1, (Word) $2, 0);
}
}
| Number AbsMonth Number {rc = vDateTime.SetDateFunc((Word) $1, (Word) $2, (Word) $3);}
;
TimeExpr : Number { rc = vDateTime.SetTime ((Word) $1, 0, 0);}
| Number Number { rc = vDateTime.SetTime ((Word) $1, (Word) $2, 0); }
| Number Number Number { rc = vDateTime.SetTime ((Word) $1, (Word) $2, (Word) $3); }
;
AMPMTimeExpr : TimeExpr _DTP_AM { rc = vDateTime.SetTo24hr(TP_AM) ; }
| TimeExpr _DTP_PM { rc = vDateTime.SetTo24hr(TP_PM) ; }
| _DTP_AM TimeExpr { rc = vDateTime.SetTo24hr(TP_AM) ; }
| _DTP_PM TimeExpr { rc = vDateTime.SetTo24hr(TP_PM) ; }
;
SDFDateTimeExpr : SDFNumber { rc = vDateTime.SetSDF ($1);}
;
MonthExpr : AbsMonth { rc = vDateTime.SetNrmMth ($1);}
| AbsMonth Number { rc = vDateTime.Set ($1,$2);}
;
Number : _DTP_LONG { $$ = $1; }
;
SDFNumber : _DTP_SDF { $$ = $1; }
;
EOS : _DTP_EOS { $$ = $1; }
;
AbsMonth : _DTP_MONTH { $$ = $1; }
;
%%
It gives three shift/reduce conflicts. How can I remove them?

The shift-reduce conflicts are inherent in the "little language" that your grammar describes. Consider the stream of input tokens
_DTP_LONG _DTP_LONG _DTP_LONG EOS
Each _DTP_LONG can be reduced as a Number. But should
Number Number Number
be reduced as a 1-number DateShortExpr followed by a 2-number TimeExpr, or as a 2-number DateShortExpr followed by a 1-number TimeExpr (or, for that matter, as a 3-number DateShortExpr with no time at all)? The ambiguity is built in.
If possible, redesign your language by adding additional symbols to distinguish dates from times--colons to set off the parts of a time and slashes to set off the parts of a date, for instance.
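To make the ambiguity concrete, here is a small Python sketch. It is not part of the grammar, and the function name splits is made up for illustration. It enumerates every way a run of number tokens can be divided into a date part of 1 to 3 numbers and an optional time part of 1 to 3 numbers, which is what the DateShortExpr and TimeExpr rules allow.

```python
def splits(n_numbers, max_part=3):
    """Return (date_len, time_len) pairs that cover exactly n_numbers tokens.

    time_len == 0 means a date with no time, matching 'DateShortExpr EOS'.
    The limit of 3 numbers per part is taken from the rules in the question.
    """
    result = []
    for date_len in range(1, max_part + 1):
        time_len = n_numbers - date_len
        if 0 <= time_len <= max_part:
            result.append((date_len, time_len))
    return result


# Show the possible readings for inputs of 1 to 6 number tokens.
for n in range(1, 7):
    print(n, splits(n))
```

Under these assumptions only inputs of one or six numbers have a single reading; every length in between has several, which is why separators such as colons and slashes help.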
Update
I don't think that you can use yacc/bison's precedence features here, because the tokens are indistinguishable.
You will have to rely on yacc/bison's default behavior when it encounters a shift/reduce conflict, that is, to shift rather than reduce. Consider this example in your output:
+------------------------- STATE 9 -------------------------+
+ CONFLICTS:
? sft/red (shift & new state 12, rule 11) on _DTP_LONG
+ RULES:
DateShortExpr : Number^ (rule 11)
DateShortExpr : Number^Number
DateShortExpr : Number^Number Number
DateLongExpr : Number^AbsMonth
DateLongExpr : Number^AbsMonth Number
+ ACTIONS AND GOTOS:
_DTP_LONG : shift & new state 12
_DTP_MONTH : shift & new state 13
: reduce by rule 11
Number : goto state 26
AbsMonth : goto state 27
What the parser will do is shift the _DTP_LONG and go to state 12, rather than reduce by rule 11 (DateShortExpr : Number). This means that whenever a Number is followed by another number, the parser will never treat that first Number as a complete DateShortExpr; it will always shift.
And a difficulty with relying on the default behavior is that it might change as you make modifications to your grammar.

Lex & Yacc AST homework

I need to build an AST for this function and print it in preorder, but when I run my parser it prints "Segmentation fault (core dumped)". Please help me resolve this problem; it has been a few days and I still do not understand what to do. I checked my syntax and the grammar works, but for some reason, once I add mknode and printtree to it, it prints this message.
void foo(int x, y, z; real f){
if (x>y) {
x = x + f;
}
else {
y = x + y + z;
x = f*2;
z = f;
}
This is my yacc file, including my function printtree and mknode.
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct node{
char *token;
struct node *left;
struct node *right;
}node;
node *mknode(char *token, node *left, node *right);
void printtree(node *tree);
%}
%union
{
char *s;
struct node *node;
}
%token IF ELSE INT CHAR VOID REAL RETURN GUI
%left '*'
%left '+'
%token <s> NUM ID FUNC
%type <node> S start function func args args1 body if_st ret_st expr block ass calc
%type <s> type
%%
S: start {printtree($1);};
start: function {$$ = mknode("CODE",$1,NULL);};
function: func { $$ = mknode("FUNC",$1, NULL); };
func: type ID '(' args ')' '{' body '}' {$$ = mknode($2,NULL, mknode("ARGS", $4,mknode($1, NULL,$7)));};
type: INT {$$ = "INT";}
| CHAR {$$ = "CHAR";}
| VOID {$$ = "VOID";}
| REAL {$$ = "REAL";};
args: type args1 args {$$ = mknode($1,$2,$3);} | type args1 {$$ = mknode($1,$2,NULL);} ;
args1: ID {$$ = mknode($1,NULL,NULL);}
| ID ';' {$$ = mknode($1,NULL,NULL);}
| ID ',' args1 {$$ = mknode($1,NULL,$3);}
| { $$ = NULL; };
body: if_st {$$ = mknode("BODY", $1, NULL);}
| ret_st {$$ = mknode("BODY", $1, NULL);};
if_st: IF'(' expr ')' '{'block'}' ELSE '{'block'}' {$$ = mknode("IF-ELSE",mknode(NULL,$3,mknode(NULL,$6,$10)),NULL);}
| IF '(' expr ')' '{'block'}'{$$ = mknode("IF",$3,$6);} ;
expr: ID '<' ID {$$ = mknode("<",mknode($1,NULL,NULL),mknode($3,NULL,NULL));}
| ID '>' ID {$$ = mknode(">",mknode($1,NULL,NULL),mknode($3,NULL,NULL));}
| ID '=' ID {$$ = mknode("==",mknode($1,NULL,NULL),mknode($3,NULL,NULL));}
| ID '<' NUM {$$ = mknode("<",mknode($1,NULL,NULL),mknode($3,NULL,NULL));}
| ID '>' NUM {$$ = mknode(">",mknode($1,NULL,NULL),mknode($3,NULL,NULL));}
| ID '=' NUM {$$ = mknode("==",mknode($1,NULL,NULL),mknode($3,NULL,NULL));};
block: block ass {$$ = mknode(NULL,$1,$2);}
| ass {$$ = mknode(NULL,$1,NULL);};
ass: ID '=' calc ';'{$$ = mknode("=",mknode($1,NULL,NULL),mknode(NULL,$3,NULL));};
calc: ID '+' calc {$$ = mknode("+",mknode($1,NULL,NULL),mknode(NULL,$3,NULL));}
| ID '*' calc {$$ = mknode("*",mknode($1,NULL,NULL),mknode(NULL,$3,NULL));}
| NUM '+' calc {$$ = mknode("+",mknode($1,NULL,NULL),mknode(NULL,$3,NULL));}
| NUM '*' calc {$$ = mknode("*",mknode($1,NULL,NULL),mknode(NULL,$3,NULL));}
| NUM {$$ = mknode($1,NULL,NULL);}
| ID {$$ = mknode($1,NULL,NULL);};
ret_st: RETURN GUI calc GUI ';' { $$ = mknode("RET", $3, NULL); };
%%
#include "lex.yy.c"
int main()
{
return yyparse();
}
node *mknode(char *token,node *left,node *right)
{
node *newnode = (node*)malloc(sizeof(node));
char *newstr = (char*)malloc(sizeof*(token)+1);
strcpy(newstr,token);
newnode->left = left;
newnode->right = right;
newnode->token = newstr;
return newnode;
}
void printtree(node *tree)
{
printf("%s\n",tree->token);
if(tree->left)
printtree(tree->left);
if(tree->right)
printtree(tree->right);
}
int yyerror()
{
printf("ERROR\n");
return 0;
}
Most likely cause of the crash:
You call mknode in several places (e.g., the block rule) with NULL as the first argument, but mknode passes this argument to strcpy as the source string, so it will crash. printtree also prints tree->token with %s, so NULL tokens need handling there as well.
Other problems:
You use sizeof*(token) where token is a char *. That is the size of a single char (1), not the length of the string, so only 2 bytes are allocated and strcpy overruns the buffer for any token longer than one character. You need strlen(token) + 1. Better yet, use strdup(token) to do the malloc+strcpy in one step (it still needs a non-NULL argument).
Your grammar is inflexible, with almost-duplicated rules and limited nesting. You're better off using fewer rules: get rid of all the calc/expr stuff and just have
expr: expr '+' expr
| expr '*' expr
| expr '<' expr
| expr '>' expr
| expr '=' expr
| ID
| NUM
| '(' expr ')'
and set precedence of your operators appropriately. Similarly block and body should be combined into one non-terminal and a couple of rules.
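As an illustration of what a single expr rule with operator precedence buys you, here is a minimal sketch in Python. It is a hand-written precedence-climbing parser, not what yacc generates, and the precedence table, token pattern and function names are assumptions made for the example. In yacc the same effect comes from %left declarations, where the operator declared last binds tightest (so the question's %left '*' followed by %left '+' makes '+' bind tighter than '*').

```python
import re

# Assumed precedence levels (higher binds tighter); all operators left-associative.
PRECEDENCE = {"<": 1, ">": 1, "=": 1, "+": 2, "*": 3}

# Numbers, identifiers, operators and parentheses, with optional leading spaces.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[+*<>=()])")


def tokenize(text):
    """Split the input into a list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_primary(tokens):
    """Parse ID, NUM or a parenthesised expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        node = parse_expr(tokens, 1)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return node
    if tok in PRECEDENCE or tok == ")":
        raise SyntaxError(f"unexpected token {tok!r}")
    return tok


def parse_expr(tokens, min_prec=1):
    """Parse a binary expression, returning nested (op, left, right) tuples."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Requiring a strictly higher level on the right makes the operator left-associative.
        right = parse_expr(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


print(parse_expr(tokenize("x + f * 2")))
print(parse_expr(tokenize("(x + y) * z > 3")))
```

One rule covers comparisons, sums, products and nesting, and every node has a real operator label, so the tree never needs NULL tokens.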

TYPO3 encrypted mailto-link (javascript:linkTo_UnCryptMailto) not working with subject and body

In TYPO3 mailto links are decrypted by the following code snippet.
Is there a way to use this with mailto links, which contain subject and body text?
e.g.: email@example.org?subject=This is my subject&body=This is my bodytext: more text...etc.
// decrypt helper function
function decryptCharcode(n,start,end,offset) {
n = n + offset;
if (offset > 0 && n > end) {
n = start + (n - end - 1);
} else if (offset < 0 && n < start) {
n = end - (start - n - 1);
}
return String.fromCharCode(n);
}
// decrypt string
function decryptString(enc,offset) {
var dec = "";
var len = enc.length;
for(var i=0; i < len; i++) {
var n = enc.charCodeAt(i);
if (n >= 0x2B && n <= 0x3A) {
dec += decryptCharcode(n,0x2B,0x3A,offset); // 0-9 . , - + / :
} else if (n >= 0x40 && n <= 0x5A) {
dec += decryptCharcode(n,0x40,0x5A,offset); // @ A-Z
} else if (n >= 0x61 && n <= 0x7A) {
dec += decryptCharcode(n,0x61,0x7A,offset); // a-z
} else {
dec += enc.charAt(i);
}
}
return dec;
}
// decrypt spam-protected emails
function linkTo_UnCryptMailto(s) {
location.href = decryptString(s,-3);
}
If it does not work by default (that may depend on how it is used and which application handles the link, but I remember having used it before), you might need to encode special characters for use in URLs.
Try the PHP function urlencode, or replace all spaces with %20 or +.
Hmm, that works for me (TYPO3 v10).
TypoScript setup:
config.spamProtectEmailAddresses = -3
https://docs.typo3.org/m/typo3/reference-typoscript/master/en-us/Setup/Config/Index.html#spamprotectemailaddresses
Fluid:
<f:link.email email="my@email.tld?subject=123&body=Hello there!">link</f:link.email>
That opens my E-Mail-Client with subject and body (Firefox 84, Thunderbird).
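To see why the subject and body survive, here is a Python port of the two JavaScript helpers from the question. It is a sketch for experimenting, not TYPO3's own server-side code, and the sample link is made up. Characters outside the three ranges, such as "?", "&", "=" and the space, are passed through untouched, so only the letters, digits and a few punctuation marks are shifted.

```python
def decrypt_charcode(n, start, end, offset):
    """Shift code point n by offset, wrapping around inside [start, end]."""
    n += offset
    if offset > 0 and n > end:
        n = start + (n - end - 1)
    elif offset < 0 and n < start:
        n = end - (start - n - 1)
    return chr(n)


def decrypt_string(enc, offset):
    """Apply the shift to every character that falls in one of the three ranges."""
    ranges = ((0x2B, 0x3A), (0x40, 0x5A), (0x61, 0x7A))  # + , - . / 0-9 :   @ A-Z   a-z
    out = []
    for ch in enc:
        n = ord(ch)
        for start, end in ranges:
            if start <= n <= end:
                out.append(decrypt_charcode(n, start, end, offset))
                break
        else:
            out.append(ch)  # '?', '&', '=', ' ' and others are left as they are
    return "".join(out)


# Hypothetical link; shifting by +3 and then by -3 should give it back unchanged.
link = "mailto:email@example.org?subject=This is my subject&body=Hello there!"
encrypted = decrypt_string(link, 3)
print(encrypted)
print(decrypt_string(encrypted, -3))
```

If a link still fails in a particular mail client, the unencoded spaces in the subject or body are the first thing to check.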

Find duplicate words in two text files using command line

I have two text files:
f1.txt
boom Boom pow
Lazy dog runs.
The Grass is Green
This is TEST
Welcome
and
f2.txt
Welcome
I am lazy
Welcome, Green
This is my room
Welcome
bye
In Ubuntu Command Line I am trying:
awk 'BEGIN {RS=" "}FNR==NR {a[$1]=NR; next} $1 in a' f1.txt f2.txt
and getting output:
Green
This
is
My desired output is:
lazy
Green
This is
Welcome
Description: I want to compare the two text files line by line and output all duplicate words. The matching should be case-insensitive. The comparison should be line by line, rather than looking for a word from f1.txt anywhere in the whole of f2.txt. For example, the word "Welcome" should not be in the desired output if it were on line 6 instead of line 5 in f2.txt.
Well, then. With awk:
awk 'NR == FNR { for(i = 1; i <= NF; ++i) { a[NR,tolower($i)] = 1 }; next } { flag = 0; for(i = 1; i <= NF; ++i) { if(a[FNR,tolower($i)]) { printf("%s%s", flag ? OFS : "", $i); flag = 1 } } if(flag) print "" }' f1.txt f2.txt
This works as follows:
NR == FNR { # While processing the first file:
for(i = 1; i <= NF; ++i) { # Remember which fields were in
a[NR,tolower($i)] = 1 # each line (lower-cased)
}
next # Do nothing else.
}
{ # After that (when processing the
# second file)
flag = 0 # reset flag so we know we haven't
# printed anything yet
for(i = 1; i <= NF; ++i) { # wade through fields (words)
if(a[FNR,tolower($i)]) { # if this field was in the
# corresponding line in the first
# file, then
printf("%s%s", flag ? OFS : "", $i) # print it (with a separator if it
# isn't the first)
flag = 1 # raise flag
}
}
if(flag) { # and if we printed anything
print "" # add a newline at the end.
}
}
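For comparison, the same line-by-line logic can be sketched in Python. The function name and the in-memory lists are assumptions for the example; the awk script above reads the two files directly.

```python
def common_words_per_line(lines1, lines2):
    """For each pair of corresponding lines, keep the words of the second
    line that also occur (ignoring case) in the first line."""
    results = []
    for line1, line2 in zip(lines1, lines2):
        seen = {word.lower() for word in line1.split()}
        matches = [word for word in line2.split() if word.lower() in seen]
        if matches:
            results.append(" ".join(matches))
    return results


f1 = ["boom Boom pow", "Lazy dog runs.", "The Grass is Green", "This is TEST", "Welcome"]
f2 = ["Welcome", "I am lazy", "Welcome, Green", "This is my room", "Welcome", "bye"]
for line in common_words_per_line(f1, f2):
    print(line)
```

As in the awk version, words are split on whitespace only, so "Welcome," with its trailing comma does not match "Welcome".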

Pivot table in AWK

I need to transform elements from an array to column index and return the value of $3 for each column index.
I don't have access to gawk 4, so I cannot work with real multidimensional arrays.
Input
Name^Code^Count
Name1^0029^1
Name1^0038^1
Name1^0053^1
Name2^0013^3
Name2^0018^3
Name2^0023^5
Name2^0025^1
Name2^0029^1
Name2^0038^1
Name2^0053^1
Name3^0018^1
Name3^0060^1
Name4^0018^2
Name4^0025^5
Name5^0018^2
Name5^0025^1
Name5^0060^1
Desired output
Name^0013^0018^0023^0025^0029^0038^0053^0060
Name1^^^^^1^1^1^
Name2^3^3^5^1^1^1^1^
Name3^^1^^^^^^1
Name4^^2^^5^^^^
Name5^^^^1^^^^1
Any suggestions on how to tackle this task without using real multidimensional arrays?
The following solution uses GNU awk's asorti() function for sorting, a gawk extension that predates gawk 4. It does not use true multidimensional arrays; it only simulates one with the map[$1,$2] subscript.
awk -F"^" '
NR>1{
map[$1,$2] = $3
name[$1]++
value[$2]++
}
END{
printf "Name"
n = asorti(value, v_s)
for(i=1; i<=n; i++) {
printf "%s%s", FS, v_s[i]
}
print ""
m = asorti(name, n_s)
for(i=1; i<=m; i++) {
printf "%s", n_s[i]
for(j=1; j<=n; j++) {
printf "%s%s", FS, map[n_s[i],v_s[j]]
}
print ""
}
}' file
Name^0013^0018^0023^0025^0029^0038^0053^0060
Name1^^^^^1^1^1^
Name2^3^3^5^1^1^1^1^
Name3^^1^^^^^^1
Name4^^2^^5^^^^
Name5^^2^^1^^^^1
This will work with any awk and will order the code columns numerically while keeping the names in the order they occur in your input file:
$ cat tst.awk
BEGIN{FS="^"}
NR>1 {
if (!seenNames[$1]++) {
names[++numNames] = $1
}
if (!seenCodes[$2]++) {
# Insertion Sort - start at the end of the existing array and
# move everything greater than the current value down one slot
# leaving open the slot for the current value to be inserted between
# the last value smaller than it and the first value greater than it.
for (j=++numCodes;codes[j-1]>$2+0;j--) {
codes[j] = codes[j-1]
}
codes[j] = $2
}
count[$1,$2] = $3
}
END {
printf "%s", "Name"
for (j=1;j<=numCodes;j++) {
printf "%s%s",FS,codes[j]
}
print ""
for (i=1;i<=numNames;i++) {
printf "%s", names[i]
for (j=1;j<=numCodes;j++) {
printf "%s%s",FS,count[names[i],codes[j]]
}
print ""
}
}
...
$ awk -f tst.awk file
Name^0013^0018^0023^0025^0029^0038^0053^0060
Name1^^^^^1^1^1^
Name2^3^3^5^1^1^1^1^
Name3^^1^^^^^^1
Name4^^2^^5^^^^
Name5^^2^^1^^^^1
Since you only have two "dimensions", it is easy enough to use one array for each dimension and a joining array with a calculated column name. I didn't do the sorting of columns or rows, but the idea is pretty basic.
#!/usr/bin/awk -f
#
BEGIN { FS = "^" }
(NR == 1) {next}
{
rows[$1] = 1
columns[$2] = 1
join_table[$1 "-" $2] = $3
}
END {
printf "Name"
for (col_name in columns) {
printf "^%s", col_name
}
printf "\n"
for (row_name in rows) {
printf "%s", row_name
for (col_name in columns) {
printf "^%s", join_table[row_name "-" col_name]
}
printf "\n"
}
}
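The idea shared by all three answers, one lookup table keyed by the (row, column) pair plus separate lists of row and column names, can be sketched in Python as follows. The function name pivot and the short sample input are assumptions for illustration; the tuple key plays the role of awk's map[$1,$2] subscript.

```python
def pivot(lines, sep="^"):
    """Pivot 'Name^Code^Count' records into one row per name, one column per code."""
    header, *records = [line.split(sep) for line in lines if line.strip()]
    names = {}    # dict used as an ordered set: names in first-seen order
    codes = set()
    counts = {}   # keyed by the (name, code) tuple
    for name, code, count in records:
        names.setdefault(name, None)
        codes.add(code)
        counts[name, code] = count
    columns = sorted(codes, key=int)  # numeric column order
    out = [sep.join([header[0], *columns])]
    for name in names:
        # Missing (name, code) combinations become empty cells.
        cells = [counts.get((name, code), "") for code in columns]
        out.append(sep.join([name, *cells]))
    return out


sample = ["Name^Code^Count", "Name1^0029^1", "Name2^0013^3", "Name2^0029^1"]
for row in pivot(sample):
    print(row)
```

Keeping the names in first-seen order and sorting only the codes mirrors the tst.awk answer; sorting the names as well would be a one-line change.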

Where does the "newline" (\n) come from? (pattern matching using "flex")

I have an experimental flex source file(lex.l):
%option noyywrap
%{
int chars = 0;
int words = 0;
int lines = 0;
%}
delim [ \t\n]
ws {delim}+
letter [A-Za-z]
digit [0-9]
id {letter}({letter}|{digit})*
number {digit}+(\.{digit}+)?(E[+-]?{digit}+)?
%%
{letter}+ { words++; chars += strlen(yytext); printf("Word\n"); }
\n { chars++; lines++; printf("Line\n"); }
. { chars++; printf("SomethingElse\n"); }
%%
int main(int argc, char **argv)
{
if(argc > 1)
{
if(!(yyin = fopen(argv[1], "r")))
{
perror(argv[1]);
return (1);
}
}
yylex();
printf("lines: %8d\nwords: %8d\nchars: %8d\n", lines, words, chars);
}
I created an input file called "input.txt" with "red apple" written in it. Command line:
$ flex lex.l
$ cc lex.yy.c
$ ./a.out < input.txt
Word
SomethingElse
Word
Line
lines: 1
words: 2
chars: 10
Since there is no newline character in the input file, why is the "\n" pattern in lex.l matched? (I expected "lines" to be 0 and "chars" to be 9.)
(I am using OS X.)
Thanks for your time.
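It is very likely that your text editor automatically inserted a newline at the end of the file when saving. That would account for both the extra line and the tenth character: 9 characters for "red apple" plus one newline.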
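One way to confirm this is to look at the last byte of the file. The following Python sketch does that; the helper name ends_with_newline is made up, and the temporary file only stands in for the real input.txt.

```python
import os
import tempfile


def ends_with_newline(path):
    """Return True if the last byte of the file is a line feed."""
    with open(path, "rb") as f:
        data = f.read()
    return data.endswith(b"\n")


# Demo on a stand-in file written the way most editors save it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")
    with open(path, "wb") as f:
        f.write(b"red apple\n")
    print(ends_with_newline(path), os.path.getsize(path))
```

Running the helper on the actual input.txt shows whether the editor added the trailing newline; the file size in bytes should then match the "chars" count that the scanner reports.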
