Parsing Expression Grammar
      
      
        Parsing expression grammars (PEGs) are an analytic grammar formalism.
      
      
        They feature the following operators:
      
      
        - 
           sequencing (e1 e2)
        
 
        - 
           unambiguous choice  (e1/e2)
        
 
        - 
           (greedy) repetition (e*)
        
 
        - 
           option (e?)
        
 
        - 
           one-or-more (e+)
        
 
        - 
           positive lookahead (&e)
        
 
        - 
           negative lookahead (!e)
        
 
      
      
        They are:
      
      
        - 
           fast to parse (O(n) in time and space, where n is the size of the input)
        
 
        - 
           unambiguous by construction
        
 
        - 
           closed under composition, which allows modular grammars
        
 
        - 
           more expressive than context-free grammars (they can parse L="a"^n"b"^n"c"^n)
        
 
      
      
        However, they come with their share of problems:
      
      
        - 
           PEGs can't directly handle left-recursion (not much of a problem in practice)
        
 
        - 
           PEGs suffer from so called "prefix capturing": "a*a" will never match any input
          
            - 
               (this is what you get for defining away ambiguity)
            
 
          
         
        - 
           PEGs are bad for specifying languages: just guess what language "S <- aSa / aa" parses
          
            - 
               for the lazy: A string of "a"s with a length which is a power of two!
            
 
          
         
        - 
           PEGs can't generate strings, only parse them (due to lookahead operators)
          
            - 
               it seems to me that one could 'enforce' discovery of whatever one looked ahead to find
            
 
            - 
               Generating "(&A)B" would involve finding the intersection of A and B. This is an undecidable problem: If it would be possible to find the intersection of two PEGs we would not have issues detecting prefix capture. Prefix-capture is present in "A/B" iff the language generated by "(&A)B" is not empty. See [1] for more information.
            
 
          
         
      
      
        [1] http://charybde.homeunix.org/~schmitz/pub/modular.pdf
      
      
      
        CategoryCompilers