Ruby – Very simple SEXP parser

For an assignment, we have to implement something like a very basic sexp parser, such that for input like:

 "((a b) ((c d) e) f)"

It will return:

[["a", "b"], [["c", "d"], "e"], "f"]

Since this is part of a larger task, the parser is only ever given valid input (matching parens, etc.). I came up with the following solution in Ruby:

def parse s, start, stop
  tokens = s.scan(/#{Regexp.escape(start)}|#{Regexp.escape(stop)}|\w+/)

  stack = [[]]

  tokens.each do |tok|
    case tok
    when start
      stack << []
    when stop
      stack[-2] << stack.pop
    else
      stack[-1] << tok
    end
  end

  return stack[-1][-1]
end

This may not be the best solution, but it does the job.
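To make the role of start and stop concrete, here is a small self-contained run of the same stack approach. The method body is repeated only so the snippet runs on its own; the second call with square brackets is my assumption about how the delimiter arguments are meant to be used, and it is the reason Regexp.escape is needed.

```ruby
# Same stack-based algorithm as above, repeated so this snippet is runnable alone.
def parse(s, start, stop)
  tokens = s.scan(/#{Regexp.escape(start)}|#{Regexp.escape(stop)}|\w+/)
  stack = [[]]
  tokens.each do |tok|
    case tok
    when start then stack << []              # open a new nested list
    when stop  then stack[-2] << stack.pop   # close it and attach it to its parent
    else            stack[-1] << tok         # plain symbol goes into the current list
    end
  end
  stack[-1][-1]
end

p parse("((a b) ((c d) e) f)", "(", ")")
# prints [["a", "b"], [["c", "d"], "e"], "f"]

# Assumed usage: "[" and "]" are regex metacharacters, hence Regexp.escape.
p parse("[[a b] [[c d] e] f]", "[", "]")
# prints [["a", "b"], [["c", "d"], "e"], "f"]
```

Note that stack[-1][-1] returns only the last top-level expression, which is fine here because the input is assumed to be a single well-formed list.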

Now, I am interested in an idiomatic Haskell solution for the core functionality (i.e. I don’t care about the lexing or the choice of delimiters; taking already-lexed input would be fine), if possible using only “core” Haskell, without extensions or libraries like parsec.

Please note that this is NOT part of the assignment; I am just interested in how Haskell would handle it.

[["a", "b"], [["c", "d"], "e"], "f"]

does not have a valid type in Haskell (because all elements of a list must have the same type), so you need to define your own data structure for nested lists, like this:

data NestedList = Value String | Nesting [NestedList]

Now, if you have a list of tokens, where Token is defined as data Token = LPar | RPar | Symbol String, you can parse it into a list of NestedList values like this:

parse = fst . parse'

parse' (LPar : tokens) =
    let (inner, rest) = parse' tokens
        (next, outer) = parse' rest
    in
      (Nesting inner : next, outer)
parse' (RPar : tokens) = ([], tokens)
parse' ((Symbol str) : tokens) =
    let (next, outer) = parse' tokens in
      (Value str : next, outer)
parse' [] = ([],[])
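For comparison with the Ruby version in the question, the same recursion can be sketched in Ruby. This is only an illustrative translation, not part of the answer itself: the method name parse_rest is made up, and plain strings "(" and ")" stand in for the LPar and RPar tokens.

```ruby
# Returns [items, remaining_tokens], mirroring the (result, rest) pair of parse'.
def parse_rest(tokens)
  return [[], []] if tokens.empty?
  head, *tail = tokens
  case head
  when "("
    inner, rest = parse_rest(tail)        # contents up to the matching ")"
    following, outer = parse_rest(rest)   # siblings that come after this list
    [[inner] + following, outer]
  when ")"
    [[], tail]                            # end of the current list
  else
    following, outer = parse_rest(tail)
    [[head] + following, outer]
  end
end

tokens = "((a b) ((c d) e) f)".scan(/\(|\)|\w+/)
p parse_rest(tokens).first
# prints [[["a", "b"], [["c", "d"], "e"], "f"]]
```

As with parse above, the result is a list of all top-level expressions, so a single expression comes back wrapped in one extra list.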

