pvaneynd ([personal profile] pvaneynd) wrote in [personal profile] simont 2016-06-22 10:47 am (UTC)

At my work we often have to extract data from output our customers send us, more or less mutilated.

People often try to use regular expressions to parse this output, which actually has a grammar. Instead of using a parsing tool that understands the grammar, they write 'quick' regexp patterns which get the data they are interested in.

Then they discover that another version gives the data in a slightly different way. Another platform is again slightly different. In the end the 'simple' regexp becomes a tangled mess of line noise. For bonus points, this pattern often has to ignore line endings and will be unbounded, then applied to multi-megabyte files, in a loop.

Going for a simple parser would have been much easier in the long run. Or at least a sane middle way like TextFSM.
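
To make the contrast concrete, here is a rough sketch in Python of the kind of small line-oriented parser I mean. The device output format, the field names and the function parse_interfaces are all invented for illustration; real customer output will differ. The point is only that each pattern is anchored and sees one line at a time, so nothing can run away across a multi-megabyte file.

```python
import re

# Hypothetical device output; the format is invented for illustration.
# Note the two spellings of the mtu line ("mtu 1500" vs "mtu: 9000"),
# standing in for the "slightly different" versions seen in practice.
SAMPLE = """\
Interface eth0
  status: up
  mtu 1500
Interface eth1
  status: down
  mtu: 9000
"""

# Small anchored patterns, each applied to a single line only.
HEADER = re.compile(r"^Interface\s+(\S+)\s*$")
FIELD = re.compile(r"^\s+(\w+):?\s+(\S+)\s*$")


def parse_interfaces(text):
    """Return {interface name: {field: value}} for the sample format."""
    interfaces = {}
    current = None  # fields of the interface block we are inside, if any
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A header line opens a new block; later fields belong to it.
            current = interfaces.setdefault(header.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
        # Lines matching neither pattern are skipped.
    return interfaces


if __name__ == "__main__":
    print(parse_interfaces(SAMPLE))
```

When another platform turns up with a different layout, that means one more small pattern or one more state, not another alternation bolted onto a single giant expression.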
