Erlang Challenge: Level 4

By anders pearson 01 Sep 2006

(Read [[Erlang Challenge: Level 1]], [[Erlang Challenge: Level 2]], [[Erlang Challenge: Level 2 (2)]], and [[Erlang Challenge: Level 3]] before reading this post or it won’t make sense)

Level 4 of the challenge gets a little meatier. The challenge page has a url that ends like ‘linkedlist.php?nothing=12345’ and the body of the page is just the text “the next nothing is 098093”. You put that number into the url and it gives you another page with the same text but a different number. So the challenge is to write a script that does an HTTP GET request, parses the next “nothing” out of the body of the response, constructs the new url and repeats until it reaches the end of the chain.

Parsing the number out of the text shouldn’t be any harder than the previous challenge, so first we just tackle that in its own function:

:::erlang
extract_nothing(Body) ->
    case regexp:first_match(Body,"the next nothing is [0-9]+") of
        {match,Start,Length} ->
            string:substr(Body,Start + 20,Length - 20);
        nomatch ->
            io:format("no match found in body text: ~s~n",[Body])
    end.

Again, we’re just using the regexp library to look for a match and then string:substr/3 to pull it out. This time there’s a little bit of error checking though in the form of the case statement. regexp:first_match/2 can return either ‘{match,Start,Length}’ or ‘nomatch’ when it doesn’t find anything. The case statement just lets us easily switch on the patterns and print out a message with the full body text when it doesn’t find a match. That condition would presumably mark the end of the chain that we’re following.
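For anyone following along on a recent Erlang/OTP release: the regexp module used above has since been dropped in favour of the re module. A minimal sketch of the same extraction with re:run/3 might look like the following. The module name nothing_re and the tagged return values ({ok, Digits} or nomatch) are assumptions made for this illustration, not part of the original script.

```erlang
-module(nothing_re).
-export([extract_nothing/1]).

%% Hypothetical variant of extract_nothing/1 built on the re module.
%% Returns {ok, Digits} when the phrase is found, or nomatch otherwise,
%% so the caller can decide what to do at the end of the chain.
extract_nothing(Body) ->
    %% The capture group isolates the digits, so no offset arithmetic is needed.
    Pattern = "the next nothing is ([0-9]+)",
    case re:run(Body, Pattern, [{capture, all_but_first, list}]) of
        {match, [Digits]} -> {ok, Digits};
        nomatch -> nomatch
    end.
```

Returning nomatch rather than printing inside the function leaves it to the caller to treat a failed match as the end of the chain.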

Now, the part of this challenge that really was useful for me was that the script has to make GET requests. I consider making HTTP requests to be a fundamental operation these days. The “microapp” architecture that I’ve been pushing lately involves exposing as much as possible over a REST/HTTP interface. Once I can make HTTP requests in Erlang, it means that I now have available all the functionality that I’ve previously exposed as a microapp. That means easy shared storage, tagging, threaded comments, full-text search, image processing, and a number of other special purpose microapps that I and other people have built. It instantly makes it more feasible for me to start building real web applications in Erlang.

It took me a while with Google to find it, but Erlang does include a pretty decent HTTP client module. It’s pretty similar to Python’s urllib or httplib2 libraries. The only difference is that it actually does the requests in a separate Erlang process to allow for nice asynchronous requests. This actually seems to be typical of how Erlang interacts with any outside system. There’s always a special process that gets spawned that acts as a proxy. So your code just interacts with it like a regular Erlang process. That’s an important concept to understand, but for the purpose of just making a regular synchronous GET request, it basically just means that we have to call the:

:::erlang
application:start(inets).

function once to spawn the process before we can make any HTTP requests. Other than that, it’s just regular function calls in the http module.
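Since application:start/1 returns an error tuple if the application is already running, a small wrapper can make the start-up step safe to call more than once. This is only a sketch; the module name inets_boot and the function ensure_inets/0 are hypothetical names introduced for the example. (Note that in current Erlang/OTP releases the HTTP client module is named httpc rather than http.)

```erlang
-module(inets_boot).
-export([ensure_inets/0]).

%% Hypothetical helper: start inets, treating "already started" as success.
ensure_inets() ->
    case application:start(inets) of
        ok -> ok;
        {error, {already_started, inets}} -> ok;
        {error, Reason} -> {error, Reason}
    end.
```

Calling inets_boot:ensure_inets() at the top of a script means it doesn’t matter whether inets was already started earlier in the same shell session.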

It didn’t take long after finding the documentation to make a function which takes one “nothing”, fetches the page, extracts the next one from the body text and returns it:

:::erlang
next_nothing(Nothing) ->
    Url = "http://www.pythonchallenge.com/.../linkedlist.php?nothing=" ++ Nothing,
    {ok, {{_, 200, _}, _, Body}} = http:request(get, {Url, []}, [], []),
    extract_nothing(Body).

It constructs the url, makes the request, uses pattern matching (with liberal use of the ‘_’ “don’t care” variable) to pull the body text out of the response tuple, and then uses the previous extract_nothing/1 function to parse it and get the next number.

Reading the full documentation for the http module I realized that I could simplify the http:request line to just:

:::erlang
{ok, {{_, 200, _}, _, Body}} = http:request(Url),

since a GET request with no extra stuff is just the default.

Error handling could be added with another case statement that just caught anything that didn’t come back with ‘ok’ as the first element of the tuple and printed a message or something. I’m pretty much expecting to not encounter errors here though so we’ll skip it.
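For completeness, here is a rough sketch of what that error handling could look like, written as function clauses rather than a case statement. It only inspects the shape of the tuple that the request call returns, so it can be tried out without touching the network. The module name response_shape, the function body_of/1 and the {error, {http_status, Status}} return value are all made up for this illustration.

```erlang
-module(response_shape).
-export([body_of/1]).

%% Hypothetical helper: classify the result of an HTTP request call.
%% A 200 response yields the body; anything else becomes an error tuple.
body_of({ok, {{_Version, 200, _Reason}, _Headers, Body}}) ->
    {ok, Body};
body_of({ok, {{_Version, Status, _Reason}, _Headers, _Body}}) ->
    %% The request completed, but with a non-200 status code.
    {error, {http_status, Status}};
body_of({error, Reason}) ->
    %% The request itself failed (connection refused, timeout, etc.).
    {error, Reason}.
```

With something like this, next_nothing/1 could pass the result of the request straight to body_of/1 and only call extract_nothing/1 on an {ok, Body} result.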

All that’s left now is to wrap this in a “loop” to have it repeat until it hits a page that it can’t parse a “nothing” out of:

:::erlang
all_nothings(Nothing,Count) when Count > 300 ->
    io:format("finished on: ~s~n",[Nothing]);
all_nothings(Nothing,Count) ->
    io:format("getting nothing: ~s~n",[Nothing]),
    NewNothing = next_nothing(Nothing),
    all_nothings(NewNothing,Count + 1).

all_nothings(Nothing) -> all_nothings(Nothing,0).

A hint on the challenge page says that it shouldn’t take more than 300 steps to get to the end, so I put in a counter to keep track of how many requests we’d made and put a guard on the first clause to stop it at 300. (Looking at the discussion of the challenge afterwards, I learned that this hint was there because there was a trick in the challenge. If you didn’t match on the whole phrase and just pulled out the first number from the page, one of them sends you off on a different chain that turns into a loop and will go on forever.)

The second clause does the main work, printing out the current “nothing”, getting the next via next_nothing/1 and then incrementing the counter and recursing. all_nothings/1 is just a driver function added for convenience that starts the whole thing with a counter of 0.
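One variation on this loop is to stop explicitly when no further “nothing” can be found, instead of relying only on the counter. The sketch below takes the fetching step as a fun so it can be exercised without any network access. The module name chain_sketch, the function follow/3 and the convention that the fun returns {ok, Next} or nomatch are assumptions for the example, not how the script above works.

```erlang
-module(chain_sketch).
-export([follow/3]).

%% Hypothetical loop: Fetch is a fun(Nothing) -> {ok, Next} | nomatch.
%% StepsLeft is the number of fetches still allowed before giving up.
follow(_Fetch, Nothing, 0) ->
    %% Step budget exhausted; report where we stopped.
    {limit_reached, Nothing};
follow(Fetch, Nothing, StepsLeft) when StepsLeft > 0 ->
    case Fetch(Nothing) of
        {ok, Next} -> follow(Fetch, Next, StepsLeft - 1);
        %% No further "nothing" on the page: this is the end of the chain.
        nomatch -> {done, Nothing}
    end.
```

A stand-in fun makes it easy to check both ways out of the loop: chain_sketch:follow(fun("1") -> {ok, "2"}; (_) -> nomatch end, "1", 300) walks from "1" to "2" and stops there, while a fun that always returns {ok, Same} runs into the step limit.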

This level of the challenge has one or two other little snags that make it tricky, but I won’t mention them since they didn’t involve any programming to get around.

I’m happy to see that making HTTP requests in Erlang is about as painless as I’m used to with other languages. That really opens it up for me as a legitimate option for my own work.

The only mildly disappointing thing I discovered with this level (and it sort of showed up earlier, too) is that Google doesn’t seem to have done a very good job of indexing the Erlang documentation that’s out there. Once I got a little comfortable with the language, I’ve found the terse format of the documentation to be perfectly adequate, but it’s a little scattered so you have to know where to look for something. Googling something like “erlang http module” doesn’t really turn up anything useful. Instead, I had to browse around the documentation site to find an index of the modules and scan the list looking for one that would be appropriate. Now that I know that that’s what I have to do, it’s not too bad, but it would be nice if Erlang docs were a little more googlable.

Oh, I should also mention that Level 5 of the challenge does indeed involve writing Python code. Specifically, you have to reconstruct a pickled Python object, so it isn’t really doable in Erlang. I’ll do that one in Python so the next entry in this series will be Level 6. It may be a while before I get to it though since I won’t have internet access in my new apartment until later next week.
