thraxil.org:

Erlang Challenge: Level 4

by anders pearson Fri 01 Sep 2006 16:03:28

(Read [[Erlang Challenge: Level 1]], [[Erlang Challenge: Level 2]], [[Erlang Challenge: Level 2 (2)]], and [[Erlang Challenge: Level 3]] before reading this post or it won't make sense)

Level 4 of the challenge gets a little meatier. The challenge page has a url that ends like 'linkedlist.php?nothing=12345' and the body of the page is just the text "the next nothing is 098093". You put that number into the url and it gives you another page with the same text but a different number. So the challenge is to write a script that does an HTTP GET request, parses the next "nothing" out of the body of the response, constructs the new url and repeats until it reaches the end of the chain.

Parsing the number out of the text shouldn't be any harder than the previous challenge, so first we just tackle that in its own function:

extract_nothing(Body) ->
    case regexp:first_match(Body,"the next nothing is [0-9]+") of
        {match,Start,Length} ->
            string:substr(Body,Start + 20,Length - 20);
        nomatch ->
            io:format("no match found in body text: ~s~n",[Body])
    end.

Again, we're just using the regexp library to look for a match and then string:substr/3 to pull it out. The prefix "the next nothing is " is 20 characters long, so skipping 20 characters from the start of the match and shortening the length by 20 leaves just the digits. This time there's a little bit of error checking, though, in the form of the case statement. regexp:first_match/2 returns '{match,Start,Length}' when it finds a match and 'nomatch' when it doesn't. The case statement just lets us easily switch on the patterns and print out a message with the full body text when it doesn't find a match. That condition would presumably mark the end of the chain that we're following.
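For anyone following along on a newer Erlang release: the regexp module was later removed from OTP in favor of the re module. Here is a minimal sketch of the same extraction with re:run/3, assuming a reasonably recent OTP. The module name nothing_re and the {ok, Digits} return shape are choices made for this illustration, not part of the original script.

```erlang
-module(nothing_re).
-export([extract_nothing/1]).

%% Returns {ok, Digits} when the phrase is present, or nomatch otherwise.
extract_nothing(Body) ->
    %% The parentheses capture only the digits, so no offset arithmetic
    %% (Start + 20, Length - 20) is needed.
    Pattern = "the next nothing is ([0-9]+)",
    case re:run(Body, Pattern, [{capture, all_but_first, list}]) of
        {match, [Digits]} -> {ok, Digits};
        nomatch -> nomatch
    end.
```

Returning nomatch instead of printing lets the caller decide what the end of the chain should mean.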

Now, the part of this challenge that really was useful for me was that the script has to make GET requests. I consider making HTTP requests to be a fundamental operation these days. The "microapp" architecture that I've been pushing lately involves exposing as much as possible over a REST/HTTP interface. Once I can make HTTP requests in Erlang, it means that I now have available all the functionality that I've previously exposed as a microapp. That means easy shared storage, tagging, threaded comments, full-text search, image processing, and a number of other special purpose microapps that I and other people have built. It instantly makes it more feasible for me to start building real web applications in Erlang.

It took me a while with Google to find it, but Erlang does include a pretty decent HTTP client module. It's pretty similar to Python's urllib or httplib2 libraries. The only difference is that it actually does the requests in a separate Erlang process to allow for nice asynchronous requests. This actually seems to be typical of how Erlang interacts with any outside system. There's always a special process that gets spawned that acts as a proxy. So your code just interacts with it like a regular Erlang process. That's an important concept to understand, but for the purpose of just making a regular synchronous GET request, it basically just means that we have to call the:

application:start(inets).

function once to spawn the process before we can make any HTTP requests. Other than that, it's just regular function calls in the http module.

It didn't take long after finding the documentation to make a function which takes one "nothing", fetches the page, extracts the next one from the body text and returns it:

next_nothing(Nothing) ->
    Url = "http://www.pythonchallenge.com/.../linkedlist.php?nothing=" ++ Nothing,
    {ok, {{_, 200, _}, _, Body}} = http:request(get, {Url, []}, [], []),
    extract_nothing(Body).

It constructs the url, makes the request, uses pattern matching and liberal use of the '_' "don't care" variable to pull the body text out of the response tuple, and then uses the previous extract_nothing/1 function to parse it and get the next number.

Reading the full documentation for the http module I realized that I could simplify the http:request line to just:

{ok, {{_, 200, _}, _, Body}} = http:request(Url),

since a GET request with no extra stuff is just the default.

Error handling could be added with another case statement that just caught anything that didn't come back with 'ok' as the first element of the tuple and printed a message or something. I'm pretty much expecting to not encounter errors here though so we'll skip it.
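A minimal sketch of what that case statement might look like. It is written against httpc, which is the name the http client module goes by in newer OTP releases, and it assumes inets has already been started. The module name nothing_http, the function names, and the {error, {http_status, ...}} shape are all made up for illustration.

```erlang
-module(nothing_http).
-export([fetch_body/1, handle_response/1]).

%% Assumes application:start(inets) has already been called.
fetch_body(Url) ->
    handle_response(httpc:request(Url)).

%% Turns the raw result of the request into {ok, Body} or {error, Why}.
handle_response(Response) ->
    case Response of
        {ok, {{_Version, 200, _Reason}, _Headers, Body}} ->
            {ok, Body};
        {ok, {{_Version, Status, Reason}, _Headers, _Body}} ->
            %% The request completed, but not with a 200.
            {error, {http_status, Status, Reason}};
        {error, Reason} ->
            %% The request itself failed (connection closed, timeout, ...).
            {error, Reason}
    end.
```

Keeping the case in its own function means it can be tried out in the shell with hand-built tuples, without touching the network.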

All that's left is to wrap this in a "loop" to have it repeat until it hits a page that it can't parse a "nothing" out of:

all_nothings(Nothing,Count) when Count > 300 ->
    io:format("finished on: ~s~n",[Nothing]);
all_nothings(Nothing,Count) ->
    io:format("getting nothing: ~s~n",[Nothing]),
    NewNothing = next_nothing(Nothing),
    all_nothings(NewNothing,Count + 1).

all_nothings(Nothing) -> all_nothings(Nothing,0).

A hint on the challenge page says that it shouldn't take more than 300 steps to get to the end, so I put in a counter to keep track of how many requests we'd made and put a guard on the first clause to stop it at 300. (Looking at the discussion of the challenge afterwards, I learned that this hint was there because there was a trick in the challenge. If you didn't match on the whole phrase and just pulled out the first number from the page, one of them sends you off on a different chain that turns into a loop and will go on forever.)

The second clause does the main work, printing out the current "nothing", getting the next via next_nothing/1 and then incrementing the counter and recursing. all_nothings/1 is just a driver function added for convenience that starts the whole thing with a counter of 0.
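As written, the loop only stops cleanly at the counter limit. A variation that also stops as soon as a page has no next "nothing" could look something like the sketch below. It assumes a newer OTP with the re module, and it takes the page fetcher as a fun so it can be tried without a network connection. The module name nothing_chain, follow/3, and the return tuples are hypothetical.

```erlang
-module(nothing_chain).
-export([follow/3]).

%% Fetch is a fun that takes a "nothing" and returns the page body as a string.
%% Returns {done, LastNothing, Body} when a page has no next "nothing",
%% or {limit_reached, Nothing} once more than Max pages have been fetched.
follow(Fetch, Nothing, Max) ->
    follow(Fetch, Nothing, Max, 0).

follow(_Fetch, Nothing, Max, Count) when Count > Max ->
    {limit_reached, Nothing};
follow(Fetch, Nothing, Max, Count) ->
    Body = Fetch(Nothing),
    %% Match the whole phrase, not just the first number on the page.
    Pattern = "the next nothing is ([0-9]+)",
    case re:run(Body, Pattern, [{capture, all_but_first, list}]) of
        {match, [Next]} -> follow(Fetch, Next, Max, Count + 1);
        nomatch -> {done, Nothing, Body}
    end.
```

For the real challenge, Fetch would be a fun that builds the url for a "nothing", makes the GET request, and returns the body text.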

This level of the challenge has one or two other little snags that make it tricky, but I won't mention them since they didn't involve any programming to get around.

I'm happy to see that making HTTP requests in Erlang is about as painless as I'm used to with other languages. That really opens it up for me as a legitimate option for my own work.

The only mildly disappointing thing I discovered with this level (and it sort of showed up earlier, too) is that Google doesn't seem to have done a very good job of indexing the Erlang documentation that's out there. Once I got a little comfortable with the language, I've found the terse format of the documentation to be perfectly adequate, but it's a little scattered so you have to know where to look for something. Googling something like "erlang http module" doesn't really turn up anything useful. Instead, I had to browse around the documentation site to find an index of the modules and scan the list looking for one that would be appropriate. Now that I know that that's what I have to do, it's not too bad, but it would be nice if Erlang docs were a little more googlable.

Oh, I should also mention that Level 5 of the challenge does indeed involve writing Python code. Specifically, you have to reconstruct a pickled Python object, so it isn't really doable in Erlang. I'll do that one in Python so the next entry in this series will be Level 6. It may be a while before I get to it though since I won't have internet access in my new apartment until later next week.

TAGS: programming http erlang python challenge erlang challenge microapps

comments

Hi Anders,

I stumbled across your blog while searching for something or other to do with Erlang. Thanks! The Python challenge is a great idea and good practice for Erlang programmer wannabes like myself. I'm curious: to solve this one I came up against a couple of obstacles you didn't mention:

[SPOILER ALERT START]

One of the pages spouted "Yes. Divide by two and keep going.". The other thing was that I would randomly get {error,session_remotly_closed} exceptions.

[SPOILER ALERT END]

I haven't looked yet, but how can I skip a level? And whatever happened to your progress through the Python challenge?

Cheers, merlyn

There are forums with the Python challenge. If you poke around in there, you can usually figure out how to skip a level.

I stalled out on the level that requires Python programming. I have enough actual, useful Python projects going that I feel like if I'm going to write Python in my spare time, it should be on one of those. I can't bring myself to cheat and skip the level either.

I'm also just feeling pretty confident with sequential Erlang programming now and have been spending more time exploring the more interesting concurrency stuff and Mnesia and OTP, writing actual, useful code. So I'm not sure if doing more of the Erlang challenge will really help me as much. At this point I think my time is better spent on larger projects that might actually see production use. It was definitely a useful exercise though and I recommend it for someone still looking to get comfortable with Erlang syntax and basic sequential programming.

