parsing - Bison shift reduce conflict on comma -


i have strange shift-reduce warnings no matter change. reduced grammar:

expr : number       | number ',' expr       | expr ',' number ',' number 

bison reports shift reduce on 2nd rule comma. tried set precedence:

%nonassoc num_p %nonassoc exp_p  expr : number %prec num_p      | number ',' expr %prec exp_p       | expr ',' number ',' number 

but warning stays same. can explain me missing here?

it clear following ambiguous:

expr : number %prec num_p      | number ',' expr %prec exp_p       | expr ',' number ',' number 

since list of 3 or more numbers can parsed in various ways. speaking, can take single numbers off beginning of list or pairs of numbers off end of list, until meet somewhere in middle; however, there no definition of middle point might be.

consider, example, various parse trees produce 1, 2, 3, 4, 5. here 2 (with numbers indicating production used expand expr):

    expr(2)                       expr(3)     /    \                       /   |  \   /       \                     /    |   |    |    expr(2)                  /     |   |    |     /   \                  /      |   |  |    /     \             expr(3)    |   |  |   /    expr(3)          / | \     |   |  |   |    /  | \          /  |  \    |   |  |   |expr(1)|  \     expr(1)|   |   |   |  |   |   |   |   |       |   |   |   |   |  1 , 2 , 3 , 4 , 5       1 , 2 , 3 , 4 , 5 

both of above trees are, in sense, maximal. 1 on left takes many single numbers possible using production 2, until 2 numbers left production 3. 1 on right applies production 3 many times possible. (it have needed single application of production 2 if list of numbers had length.)

in order resolve ambiguity, need clear statement of intent. seems me unlikely can resolved precedence declaration. remember precedence relations between possible reduction (on top of parser stack) , lookahead symbol. never compare 2 lookahead symbols nor 2 productions. if lookahead symbol wins, shifted onto stack; if reduction wins, stack reduced. no more complicated that.

so if precedence help, relevant token must ',', not number. number must shifted onto parse stack. since no production ends ',', never possible reduce stack when number lookahead symbol. contrast, when ',' lookahead symbol, possible either reduce top of parser stack or shift ',' in preparation different reduction.

the place decision possible in case have seen number , looking @ ',', have decide whether apply production 1, in preparation production 3, or shift ',', leaving production 2 option. neither of these decisions can prosper: if choose reduce, possible production 3 turn out impossible because there not enough numbers in list; if choose shift, production 3 never used.

in left-to-right parse, algorithm produces right-hand parse above not possible, because cannot know whether list of or odd length until reach end, @ point far late retroactively reduce first 2 numbers. on other hand, left-hand parse require lookahead of 4 tokens, not one, in order decide @ point stop using production 2. makes possible construct lr(1) grammar, because there way rewrite lr(k) grammar lr(1) grammar, resulting grammar complicated.

i suspect neither of these intention. in order recommend resolution, necessary know precise intention is.

one possibility (motivated comment) expr includes neither number nor list of numbers:

expr: number     | complex_expression 

in case, might grammar intends capture 2 possibilities:

  • a list containing numbers possibly complex_expression @ end;

  • a list containing number of numbers possibly complex_expression @ beginning.

what left ambiguous in above formulation list consisting of numbers, since either first or last number parsed expr. here there couple of reasonable possible resolutions:

  • a list of numbers parsed first option (expr @ end)

  • a list of numbers parsed second option (expr @ beginning) if , if there odd number of elements in list.

the first resolution easier, can start it. in case, first element in list determines how list parsed, not possible reduce first number expr. can express separating different types of expr:

expr: list_starting_expr | list_ending_expr list_starting_expr: complex_expression ',' number ',' number                   | list_start_expr ',' number ',' number list_ending_expr  : complex_expression                   | number ',' list_ending_expr                    | number 

the last production in above example allows list entirely of numbers parsed list_ending_expr.

it allows list containing single complex_expression parsed list_ending_expr, while list_starting_expr required have @ least 3 elements. without that, list consisting of complex_expression have been ambiguous. in example grammar in question, list containing complex_expression implicitly forbidden; reproduced changing base production list_ending_expr list_ending_expr: complex_expression list_ending_expr: number ',' complex_expression.

but if wanted second resolution? can still recognize language, constructing correct parse tree may require surgery. can start separating out case list consists of numbers.

expr: list_starting_expr | list_ending_expr | ambiguous_list list_starting_expr: complex_expression ',' number ',' number                   | list_starting_expr ',' number ',' number list_ending_expr  : number ',' complex_expression                   | number ',' list_ending_expr  ambiguous_list    : number                   | number ',' ambiguous_list 

despite frequently-repeated claim right-recursion should avoided in bottom-up grammars, here essential ambiguous_list , list_ending_expr right-recursive, because cannot distinguish between 2 possibilities until reach end of list.

now use semantic action count number of elements in list. action needs associated reduction of ambiguous_list expr:

expr: list_starting_expr | list_ending_expr     | ambiguous_list {         if (count_elements($1) % 2 == 1) {           $$ = make_list_starting_expr($1);         }         else {           $$ = make_list_starting_expr($1);         }       } 

but can distinguish 2 cases grammatically, precisely because of right recursion:

expr: list_starting_expr     | list_ending_expr     | even_list_of_numbers     | odd_list_of_numbers list_starting_expr  : complex_expression ',' number ',' number                     | list_starting_expr ',' number ',' number list_ending_expr    : number ',' complex_expression                     | number ',' list_ending_expr  odd_list_of_numbers : number                     | number ',' number ',' odd_list_of_numbers even_list_of_numbers: number ',' number                      | number ',' number ',' even_list_of_numbers 

it might more meaningful write as:

expr: expr_type_one | expr_type_two expr_type_one: list_starting_expr | even_list_of_numbers expr_type_two: list_ending_expr | odd_list_of_numbers  /* remainder above */ 

note: above grammars, grammar in original question, not allow expr consist of complex_expression. if desired handle case of single complex_expression, syntax added directly productions expr (or whichever of expr_type_one or expr_type_two appropriate.


Comments

Popular posts from this blog

javascript - Laravel datatable invalid JSON response -

java - Exception in thread "main" org.springframework.context.ApplicationContextException: Unable to start embedded container; -

sql server 2008 - My Sql Code Get An Error Of Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value '8:45 AM' to data type int -