Factor example program

For the CLI project season one, I document and explain a little sample program using the Factor language.

 

To kick this season off I've decided to create a small example program in Factor to show off some of the things in it. I assume you have read the 'Your first program' part in the documentation.

 

We're going to write a little Twitter search program.

This will involve all sorts of things: HTTP, JSON, stack manipulation, refactoring and so on.

I've already done this exercise in a myriad of languages and consider it a good first "Hello World!" project for any language.

 

Starting out

First of all, we need to create a vocabulary for our little project to go in.

To see what your current vocab-roots are, use this:

IN: scratchpad   USE: vocabs.loader

IN: scratchpad   vocab-roots get .

You can add a new one if you like, but usually you'll work in the 'work' vocab-root.

So let's go ahead and create our vocab:

IN: scratchpad   USE: tools.scaffold

IN: scratchpad   "resource:work" "twitter-search" scaffold-vocab

Now use your editor to open the 'twitter-search.factor' file that has been created there (in the work/twitter-search subdirectory of your Factor installation).

You should see something like this:

 

! Copyright (C) 2013 Your name.
! See http://factorcode.org/license.txt for BSD license.
USING: ;
IN: twitter-search

 

(For those of you who have experience with Common Lisp: IN: is what you think it is. It sets the current vocabulary, much like in-package sets the current package.)

 

Exploring the documentation

OK, we're going to explore a bit now. I'll guide you through the first steps; after that I'll assume you can follow the jumps I make in the documentation. We need a way of downloading something from the internet. Open the help from your listener and type http in the search field.

You'll find some words that seemingly have nothing to do with this, and then some vocabs: "http", "http.server", "http.client", and a couple of others. We're interested in the "http.client" one.

So click that, we'll take a look at its documentation.

 

The Factor help will tell you here: "Not loaded" and "You must first load this vocabulary to browse its documentation and words."

Right below that, you'll see "USE: http.client". This is clickable, so go ahead and click it.

That line is now copied to your listener. Go there with your mouse and press ENTER.

Some text will appear in the listener (showing you what other vocabularies are loaded to be able to use the one you asked for) and now the full documentation for http.client is available in the help screen.

Since we just want to download something in the simplest way possible, find the 'http-get' word and click that.

(that's all you need to know to browse the documentation for now, so just follow the links in the documentation as I mention them)

 

So http-get takes either a URL or a string and returns a response and data. Since we don't really have a good idea of what these are, let's just test it out.

IN: scratchpad "http://www.example.com" http-get

--- Data stack:
T{ response f "1.1" 200 "OK" H{ ~array~ ~array~ ~array~ ~array~...
"<!doctype html>\n<html>\n<head>\n\t<title>Example Domain</title>\n..."
IN: scratchpad

Now you have some T{ ... } thing and a string on the stack.

The T{ ... } thing is the response and the string is the data. Don't worry too much about the response part for now, we don't need it anyway. Feel free to read up on tuples later.

Use . to show what the data really looks like. (. prints the top value of the stack and consumes it).

Now you see the entire HTML, JavaScript and all, just as you would if you had used wget or some other downloader.

The response stays on the data stack. We don't need it, so you can just drop it.
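
If you are curious about the response before dropping it, its slots can be read with accessor words. This is only a minimal sketch: it assumes the response tuple has a slot named code holding the status (the 200 in the output above) and that the accessors vocab is available. Neither is covered in this article.

```factor
USING: accessors http.client kernel prettyprint ;

! Fetch the page, discard the data string, and print the status code
! stored in the response tuple (assumed slot name: code).
"http://www.example.com" http-get drop code>> .
```

This needs network access, and the stack is left as it was before the line was run.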

 

Ok, you've seen it in action, let's create a proper word for it now.

 

: get-http-data ( url -- data ) http-get swap drop ;

 

Notice the use of the swap word here. It's a shuffle word, used to manipulate the stack.
You can read more about shuffle words in the documentation.

Since we don't want the response object here, only the data, we swap the two so that the response is on top and then drop it. You are left with just the data.

The combination of swap and drop is something that gets used quite often, so there is a shuffle word especially for that called nip. Let's use that word in our definition of get-http-data:

 

: get-http-data ( url -- data ) http-get nip ;

There are a whole lot of shuffle words and knowing them all will take some time. Ask around in #concatenative or on #yfl if you get stuck or want a code review.
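
To get a feel for a few of the most common shuffle words, you can try them on plain numbers in the listener. The snippet below is just an illustration and is not part of the twitter-search vocab.

```factor
USING: kernel prettyprint ;

1 2 swap drop .   ! swap gives 2 1, drop removes the 1, prints 2
1 2 nip .         ! the same result in one word, prints 2
1 2 over . . .    ! over copies the second item to the top: prints 1, 2, 1
3 dup . .         ! dup duplicates the top item: prints 3 twice
```

Every line consumes what it pushes, so the data stack is unchanged afterwards.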

 

Constants

Since the first part of the URL will always be the same, we're going to save it in a constant.
Constants are words that push a value on the stack when they are executed. There is a little bit of syntactic sugar in Factor to create them: CONSTANT:

 

CONSTANT: fixed-url-part "http://search.twitter.com/search.json?q="


Note that we're using the old v1 API of the Twitter search. It's deprecated, but for this example it'll do. Hopefully I remember to correct this article when the v1 API finally disappears.

Update: The v1 API is now gone. However, you can use services like http://www.supertweet.net/ or https://foauth.org/ that handle the OAuth authentication for you. Create an account and adjust the URLs in this tutorial accordingly.

Now that we have that, a word that creates the final URL would be handy.
For this we'll need string concatenation. Strings in Factor are sequences, so we need to look in the documentation for sequence operations.
Appending sequences is what we're going to do. The append word looks like exactly what we need.
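
A quick way to see append at work on strings is to try it in the listener. The sketch below also shows prepend, which is append with its inputs swapped, and url-encode from the urls.encoding vocab. That last word is an assumption on my part and is not used elsewhere in this article, but a search term containing spaces would need something like it before being put into a URL.

```factor
USING: kernel prettyprint sequences urls.encoding ;

"foo" "bar" append .            ! prints "foobar"
"bar" "foo" prepend .           ! also prints "foobar"
"stack language" url-encode .   ! the space is percent-encoded
```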


First add the 'sequences' vocab to your USING: line:
USING: http.client kernel sequences ;
then add this word:
: url ( str -- str ) fixed-url-part swap append ;

If you test this in the listener you should see something like this:

IN: scratchpad "concatenative" url .
"http://search.twitter.com/search.json?q=concatenative"


Now the total file should be:

! Copyright (C) 2013 Your name.
! See http://factorcode.org/license.txt for BSD license.
USING: http.client kernel sequences ;
IN: twitter-search

CONSTANT: fixed-url-part "http://search.twitter.com/search.json?q="

: get-http-data ( url -- data ) http-get nip ;

: url ( str -- str ) fixed-url-part swap append ;


If we combine these, we can already query Twitter and see the raw JSON.

IN: scratchpad "concatenative" url get-http-data

--- Data stack:
"{\"completed_in\":0.072,\"max_id\":319054328352612352,\"max_id_str..."
IN: scratchpad

 

JSON

The JSON of course arrives completely in one big string. It would be great if we could use some sort of Factor representation of it. Luckily this exists: json.reader is the vocab and json> is the word we are looking for.
json> takes a string and outputs a hashtable.
(A lot of Factor words follow naming conventions; see http://docs.factorcode.org/content/article-conventions.html)

Add json.reader to your USING: line.

USING: http.client kernel sequences json.reader ;

Since the data is still on the stack from last time, we can simply try this in the listener:

IN: scratchpad USE: json.reader
IN: scratchpad json>

--- Data stack:
H{ { "page" 1 } { "results_per_page" 15 } { "max_id_str"...
IN: scratchpad
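
If the live API is unavailable, a hand-written JSON string is enough to try json> on. The data below is made up and only mimics the shape of the old search results ("results" holding a list of tweets with "from_user" and "text"). On recent Factor releases the word may live in the json vocab rather than json.reader, so adjust the USING: line if loading fails.

```factor
USING: json.reader prettyprint ;

! A made-up response with the same shape as the old search results.
"{\"page\":1,\"results\":[{\"from_user\":\"alice\",\"text\":\"hello\"}]}"
json> .
```

Leave off the final . to keep the hashtable on the stack and continue from there.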

 

Hashtables

Now we have a hashtable on the stack. Hashtables implement the associative mapping protocol.
We don't need to go in depth here. All we need is the at* word to get values from the hashtable.

IN: scratchpad "results" swap at*

--- Data stack:
{ H{ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~...

t


Now we have a sequence of hashtables with the results in, and the boolean that tells us whether the at* was successful. We'll assume this always works to simplify this demo, and drop that value.

 

IN: scratchpad drop

--- Data stack:
{ H{ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~...
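
To see what that boolean is for, here is a small illustration on a literal hashtable (the keys and values are made up). at* leaves the value and a success flag; the plain at word from the same assocs vocab leaves only the value.

```factor
USING: assocs kernel prettyprint ;

"text" H{ { "text" "hello" } } at* . .   ! prints t, then "hello"
"nope" H{ { "text" "hello" } } at* . .   ! key missing: prints f, then f
"text" H{ { "text" "hello" } } at .      ! no flag: prints "hello"
```

Each line cleans up after itself, so the sequence of results is still on the stack afterwards.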


We'll need to map over this sequence and print out every tweet.
For now we'll use the first one to experiment on: (http://docs.factorcode.org/content/word-first,sequences.html)

 

IN: scratchpad first

--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
IN: scratchpad


Notice how well Factor lends itself to exploratory programming.
There are a lot of things we could do with this tweet, but we'll just extract the username and the text of the tweet and show that as output.
A tweet here is again contained in a hashtable, so we just need to access the right keys, "from_user" and "text".
But if we get the first value, we lose the original hashtable. So let's first dup it so we can later get the second key.

This is the second time we'll use the "key string, swap, at*, drop" sequence, so let's factor it out into its own word.
Since at* lives in the assocs vocab, we need to update the USING: line:

USING: assocs http.client json.reader kernel sequences ;


And define the word.

: get-value ( hsh str -- str ) swap at* drop ;


Keep in mind this will be potentially dangerous because if it fails you won't know about it.
But the Twitter API is quite stable and you can expect these values to be there. Also, to keep this post from growing too large, I'll leave the error handling out.
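
If you do want a little safety, one option is to use the flag from at* instead of dropping it. The word below is a hypothetical variant that is not used in the rest of this article: it takes a default value and returns it when the key is missing.

```factor
USING: assocs kernel prettyprint ;

! Hypothetical variant of get-value: returns default when the key is missing.
: get-value-or ( hsh str default -- value )
    [ swap at* ] dip   ! value ? default
    swap               ! value default ?
    [ drop ] [ nip ] if ;

H{ { "text" "hello" } } "text" "n/a" get-value-or .        ! prints "hello"
H{ { "text" "hello" } } "from_user" "n/a" get-value-or .   ! prints "n/a"
```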

Now, after a dup to keep a second copy of the hashtable, we can do this:
 

--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
IN: scratchpad "from_user" get-value

--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
"lukego"
IN: scratchpad


We can now define words to get the fields we need. (This might be a bit overkill, but it shows what is possible and how readable the code can become.)

: get-user ( hsh -- str ) "from_user" get-value ;

: get-text ( hsh -- str ) "text" get-value ;


Let's try to write the word to get both user and text on the stack.
Naively we could do it this way:

: extract-info ( hsh -- str str ) dup get-user swap get-text ;


Trying it out in the listener:
 

--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
IN: scratchpad extract-info

--- Data stack:
"lukego"
"RT @debasishg: Huge tribute to @slava_pestov  by the man hims..."
IN: scratchpad


This works well. But it's a bit clumsy, shuffling the stack around manually.

 

Combinators and cleave combinators

Factor has a lot of combinator words, and one subset of them is the cleave combinators.
I'm not going to digress much about combinators (that's a whole post on its own), but I will explain this one: bi

bi takes two quotations and applies both of them to the same argument. A quotation is a literal piece of code between square brackets; it is pushed onto the stack without being executed. The bi word takes these quotations and calls each in turn, arranging the stack so that both see the original argument.


With bi, we could write the previous code this way:

: extract-info ( hsh -- str str ) [ get-user ] [ get-text ] bi ;


We don't even need the separate get-user and get-text words, since each is used only once, so we can write it like this:

: extract-info ( hsh -- str str ) [ "from_user" get-value ] [ "text" get-value ] bi ;
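
To see bi in isolation, here it is on a plain number, next to the equivalent manual shuffling. This is just a listener experiment.

```factor
USING: kernel math prettyprint ;

5 [ 1 + ] [ 2 * ] bi . .   ! leaves 6 and 10 on the stack; prints 10, then 6
5 dup 1 + swap 2 * . .     ! the same with manual shuffling
```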


The info is on the stack now, let's create a word that pretty prints that tweet.

We'll use the printf word to do this, it resembles the other printfs you know quite well.

Don't forget to add formatting to your USING: line, and to run USE: formatting in the listener if you want to test this.

: pretty-print ( str str -- ) "<%s> %s\n" printf ;


Testing it out:

--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
"lukego"
"RT @debasishg: Huge tribute to @slava_pestov  by the man hims..."
IN: scratchpad pretty-print
<lukego> RT @debasishg: Huge tribute to @slava_pestov  by the man himself .. Manfred Von Thun .. http://t.co/6azLmJ2r
--- Data stack:
H{ { "geo" json-null } { "from_user_id_str" "362022429" } { ...
IN: scratchpad

 

Sequences

We're nearly there. The last thing we need to do is repeat this for all the results.
Factor has an 'each' word in the sequences vocab. It takes a sequence and a quotation and applies the quotation to every element in the sequence. We can use that to do the pretty-print on every tweet.
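
As a warm-up, here is each on a literal array of numbers, which is easier to follow than a list of tweets:

```factor
USING: kernel math prettyprint sequences ;

{ 1 2 3 } [ . ] each        ! prints 1, 2 and 3 on separate lines
{ 1 2 3 } [ 10 * . ] each   ! prints 10, 20 and 30
```

The quotation has to consume each element it is given; here . does that.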

Let's try this in the listener: (restarting from the top so we have all the tweets)


IN: scratchpad "concatenative" url get-http-data json>

--- Data stack:
H{ { "page" 1 } { "results_per_page" 15 } { "max_id_str"...

IN: scratchpad  "results" get-value

--- Data stack:
{ H{ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~ ~array~...


IN: scratchpad [ extract-info pretty-print ] each
<lukego> RT @debasishg: Huge tribute to @slava_pestov  by the man himself .. Manfred Von Thun .. http://t.co/6azLmJ2r
<old_sound> RT @debasishg: Huge tribute to @slava_pestov  by the man himself .. Manfred Von Thun .. http://t.co/6azLmJ2r
<Hyperglot> Non concatenative morphology in English and ESL teaching:

Non concatenative morphology in English and ESL te... http://t.co/PGwp9xRI3G
<gclaramunt> RT @psnively: Stumbled across this old but great "Essence of Concatenative Languages" post. Lots to chew on! http://t.co/gqX4DtWr9x
<psnively> Stumbled across this old but great "Essence of Concatenative Languages" post. Lots to chew on! http://t.co/gqX4DtWr9x
<ShiningRay> RT @fogus: @stevej His tribute to @slava_pestov was a tear-jerker; he passed soon after.  http://t.co/33GF9ETz
IN: scratchpad

Works like a charm!

Let's now create one word that will do the searching for us:

: search ( str -- )
    url get-http-data json> "results" get-value
    [ extract-info pretty-print ] each ;

Play with that in the listener a bit ;-)

The total source file now looks like this:
 

! Copyright (C) 2013 Your name.
! See http://factorcode.org/license.txt for BSD license.
USING: assocs http.client json.reader kernel sequences formatting ;
IN: twitter-search

CONSTANT: fixed-url-part "http://search.twitter.com/search.json?q="

: get-http-data ( url -- data ) http-get nip ;

: url ( str -- str ) fixed-url-part swap append ;

: get-value ( hsh str -- str ) swap at* drop ;

: extract-info ( hsh -- str str ) [ "from_user" get-value ] [ "text" get-value ] bi ;

: pretty-print ( str str -- ) "<%s> %s\n" printf ;

: search ( str -- )
    url get-http-data json> "results" get-value
    [ extract-info pretty-print ] each ;

 

Conclusion

That's not a lot of code for what we are doing here if you ask me. Factor can be very terse for a lot of problems. You just need to learn how to use the concatenative paradigm a bit, which will come with practise.

 

There are a couple of possible improvements for this program:

Do keep in mind that I'm still learning Factor myself, but I've already crossed the first bridge of writing a successful little program. If you find errors or know of improvements, please share them in the comments on the forum.

 
