rainier ababao reads and writes

Tutorials category

Crazy Python one-liners

Motivation

In this post, we will explore a string manipulation task with the constraint that the solution is to be in as few lines as possible. Why? Just for fun. I would never commit an unclear, maybe inefficient one-liner into production code just to show off, and you shouldn’t either. However, I think this and similar constraint puzzles (in the category of what is called “code golf”) are really fun mental exercises in abstraction, problem decomposition, and mastering a language’s esoteric features.

The Slack for one of the student organizations I’m in has a channel called #politics. The channel’s pinned post is an ever-growing URL to a visualization of each channel member’s political views on the two axes of the political spectrum created by this website: “Libertarian-Authoritarian” and “Left-Right”.

The following is a sample URL that’s pinned in our channel (names replaced with the Greek alphabet):

https://www.politicalcompass.org/crowdchart?alpha=-3.38%2C-5.38&beta=-2.75%2C-4.62&gamma=-4.63%2C-4.15&delta=-4.75%2C-4.97&epsilon=-9.0%2C-8.31&zeta=-7.38%2C-4.36&eta=-3.5%2C-3.6&theta=-4.38%2C-2.92&iota=-7.13%2C-6.0&kappa=1.63%2C-2.36&lambda=-5.75%2C-6.51&mu=-3.25%2C-3.64&nu=1.75%2C-2.15&xi=-3.75%2C-5.08&name=omicron&ec=-4.88&soc=-2.05

And a sample image:

image

(Not surprising, considering that I go to the University of Texas). It would be nice to get a JSON encoding of this information so we might be able to integrate it into dataviz tools for practice.

Problem

Given an URL, such as:

https://www.politicalcompass.org/charts/crowdchart?alpha=-3.38%2C-5.38&beta=-2.75%2C-4.62&gamma=-4.63%2C-4.15&name=omicron&ec=-4.88&soc=-2.05

write a function to deserialize it into a JSON-like encoding, such as:

[
    {
        "alpha": {
            "economic": "-3.38",
            "social": "-5.38"
        }
    },
    {
        "beta": {
            "economic": "-2.75",
            "social": "-4.62"
        }
    },
    {
        "gamma": {
            "economic": "-4.63",
            "social": "-4.15"
        }
    },
    {
        "omicron": {
            "economic": "-4.88",
            "social": "-2.05"
        }
    }
]

using as few lines as possible.

Decomposition

We are mainly interested in the names of people and their econo-/socio- political preference scores, which are delimited by various characters. In a string s = 'https://www.politicalcompass.org/charts/crowdchart?alpha=-3.38%2C-5.38&beta=-2.75%2C-4.62&gamma=-4.63%2C-4.15&name=omicron&ec=-4.88&soc=-2.05', we can split it by ? to return a list object:

In [1]: s = 'https://www.politicalcompass.org/charts/crowdchart?alpha=-3.38%2C-5.38&beta=-2.75%2C-4.62&gamma=-4.63%2C-4.15&name=omicron&ec=-4.88&soc=-2.05'

In [2]: s.split('?')
Out[2]:
['https://www.politicalcompass.org/charts/crowdchart',
 'alpha=-3.38%2C-5.38&beta=-2.75%2C-4.62&gamma=-4.63%2C-4.15&name=omicron&ec=-4.88&soc=-2.05']

Now we can get the string we’re interested in using the expression s.split('?')[1]. In length-constrained coding, we want to minimize the number of new variable assignments, and this language’s AST lets us do this. Since s.split('?')[1] returns our string of interest, we can call .split('&') on that expression to obtain a list of strings:

In [3]: s.split('?')[1].split('&')
Out[3]:
['alpha=-3.38%2C-5.38',
 'beta=-2.75%2C-4.62',
 'gamma=-4.63%2C-4.15',
 'name=omicron',
 'ec=-4.88',
 'soc=-2.05']

We are getting closer to the form we would like, but there is some cleaning up to do with special cases at the end. As soon as a person (let’s say their name is omicron) adds their own name and scores to the list, the new HTTP GET parameters remain encoded in the URL (after the third-to-the-last &). (The endpoint handler can handle URLs without the last three GET parameters, but we don’t want the client to have to conform their parameters to the previous parameters by hand). Let’s make an expression to select the last 3 elements in the list to transform it into one element that conforms to the rest of the list:

In [4]: s.split('?')[1].split('&')[-3:]
Out[4]: ['name=omicron', 'ec=-4.88', 'soc=-2.05']

But we’re still only interested in the string parts after the =, and want to make a sequence of those to combine later. If we only focus on one string name=omicron, we can see that we want to call split('=') and grab only second element of its return, i.e., 'name=omicron'.split('=')[1]. But what if we wanted to apply this function to a list of strings?

Enter the lambda syntax. With lambda, we can convert a function into a one-liner.

lambda x: x.split('=')[1]

We can even assign it to a variable and call it like this:

In [5]: my_lambda = lambda x: x.split('=')[1]

In [6]: my_lambda('foo=bar')
Out[6]: 'bar'

Also note that it looks like a mathematical function, such as f(x). But like I said earlier, we’ll keep assigning expressions to variables to the minimum. It’s important to understand that the lambda definition always contains an expression (after the :) that is returned upon evaluation.

So back to the last question - how do we apply this function to a list of strings that have = in them?

Enter the map syntax. map(func, seq) applies function func to every element in sequence seq, and it always returns a sequence (such as a list or tuple) such that each element is the return of the function applied to that element. It’s probably easier to show you:

In [7]: map(my_lambda, ['foo=bar', 'baz=qux', 'ham=spam', 'eggs=bacon'])
Out[7]: ['bar', 'qux', 'spam', 'bacon']

Let’s use map and lambda to compose a list of the strings we were interested in, building on the list expression we composed earlier in [4]:

In [8]: map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:])
Out[8]: ['omicron', '-4.88', '-2.05']

Now use this list as arguments and the string '{}={}%2C{}' as the caller to the function str.format(*args) to get a string that conforms to the other strings (e.g., alpha=-3.38%2C-5.38) in the original list:

In [9]: '{}={}%2C{}'.format(*map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))
Out[9]: 'omicron=-4.88%2C-2.05'

(* is the “splat” operator, which converts a Python list into positional arguments in a function call.)

Before we append this string to the original list, let’s get all but the last 3 elements from the original list in [3]:

In [10]: s.split('?')[1].split('&')[:-3]
Out[10]: ['alpha=-3.38%2C-5.38', 'beta=-2.75%2C-4.62', 'gamma=-4.63%2C-4.15']

We can wrap the expression from [9] that returns the new string in brackets to convert it into a single-item list, and append it to the original list using the + operator:

In [11]: s.split('?')[1].split('&')[:-3] + ['{}={}%2C{}'.format(*map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))]
Out[11]:
['alpha=-3.38%2C-5.38',
 'beta=-2.75%2C-4.62',
 'gamma=-4.63%2C-4.15',
 'omicron=-4.88%2C-2.05']

We need to break up each string by = and %2C now. Hello, map and lambda again:

In [12]: map(lambda x: [x.split('=')[0]] + x.split('=')[1].split('%2C'), s.split('?')[1].split('&')[:-3] + ['{}={}%2C{}'.format(*map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))])
Out[12]:
[['alpha', '-3.38', '-5.38'],
 ['beta', '-2.75', '-4.62'],
 ['gamma', '-4.63', '-4.15'],
 ['omicron', '-4.88', '-2.05']]

And the final map/lambda combo:

In [13]: convert_to_dictlist = lambda s: map(lambda x: {x[0] : {'economic' : x[1], 'social' : x[2]}}, map(lambda x: [x.split('=')[0]] + x.split('=')[1].split('%2C'), s.split('?')[1].split('&')[:-3] + ['{}={}%2C{}'.format(*map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))]))

Let’s try it:

In [14]: convert_to_dictlist(s)
Out[14]:
[{'alpha': {'economic': '-3.38', 'social': '-5.38'}},
 {'beta': {'economic': '-2.75', 'social': '-4.62'}},
 {'gamma': {'economic': '-4.63', 'social': '-4.15'}},
 {'omicron': {'economic': '-4.88', 'social': '-2.05'}}]

It worked, and we defined the function in one line.

We can use print json.dumps() to print it out in a nicer looking format.

In [15]: import json

In [16]: print json.dumps(convert_to_dictlist(s), indent=4)
[
    {
        "alpha": {
            "economic": "-3.38",
            "social": "-5.38"
        }
    },
    {
        "beta": {
            "economic": "-2.75",
            "social": "-4.62"
        }
    },
    {
        "gamma": {
            "economic": "-4.63",
            "social": "-4.15"
        }
    },
    {
        "omicron": {
            "economic": "-4.88",
            "social": "-2.05"
        }
    }
]

Nice!!

The answer

Disclaimer

I didn’t get this one-liner immediately off the top of my head and on the first try. I use a fantastic REPL/interactive shell called IPython (where the In/Out lines come from) that makes it easy to quickly iterate (no pun intended) on composing Python code. There, I decomposed the problem, made dozens of intermediate variable assignments to heavily abstract, and then substituted the variables with the original statements and expressions they were assigned to.

The one-liner

lambda s: map(lambda x: {x[0] : {'economic' : x[1], 'social' : x[2]}}, map(lambda x: [x.split('=')[0]] + x.split('=')[1].split('%2C'), s.split('?')[1].split('&')[:-3] + ['{}={}%2C{}'.format(*map(lambda x: x.split('=')[1], s.split('?')[1].split('&')[-3:]))]))

Expanded for “clarity”

lambda s:
  map(
    lambda x:
      { x[0]: { 'economic' : x[1], 'social' : x[2] } }
      map(
        lambda x:
          [x.split('=')[0]] + x.split('=')[1].split('%2C'),
            s.split('?')[1].split('&')[:-3] + ['{}={}%2C{}'.format(*
            map(
              lambda x:
              x.split('=')[1],
                s.split('?')[1].split('&')[-3:]
            )
            )
          ]
      )
  }

Resources

Some links about map and lambda (kinda Python specific) since my explanation didn’t do it justice:

A pretty approachable introduction to lambda expressions in Python that demonstrates their usefulness in production code, scoping rules, and addresses common points of confusion: Yet Another Lambda Tutorial

Quora: Is it better to use list comprehensions or map with a lambda function when manipulating elements in a list?

Does your favorite programming language have syntax for map, filter, reduce, and/or lambda expressions?

Optimizing your iteration speed

One Quora answer that’s really stuck with me for the past few months is Edmond Lau’s answer to “What are the things required to become a hardcore programmer?” The entire answer is solid, but this is the part I will focus on:

4. Optimize your iteration speed.

Say it takes you 12 seconds to search and navigate from a function call to where it’s defined. And suppose you have to do similar navigations 60 times per day. That’s 12 minutes per day spent navigating files. If you learn enough keyboard shortcuts in your text editor to navigate to a file in 2 seconds instead of 12, then over the course of a day, you’d save 10 minutes. Every year, you’d save an entire 40-hour work week.

Find 3 more similar optimizations, and you’d save a month. Imagine what you could do with an extra month per year.

Or suppose that you’re fixing a bug that requires you to start the app and then navigate through five screens to set up the right conditions to trigger the bug. Could you spend 10 minutes to wire it up so that it goes to the buggy screen on startup? You’d be able to iterate significantly faster on that bug with a small initial investment.

Systematically ask yourself, “How can I iterate faster?” The payoffs will be enormous.

One of the payoffs, not explicitly mentioned, is that decreasing your iteration time will lower your chances of emptying your mental “cache” or “stack” of things to do, i.e., getting distracted, while you’re spending time on minor, unnecessary context switches.

I’m a student currently at my second software internship. As someone who’s been developing stuff for just a little under two years, I’m not the fastest, but I’ve still managed to accumulate a list of small, medium, and large techniques that speed up my development. I feel like adopting every technique possible (like learning how to use the Dvorak keyboard, for me) would be impractical, so these are the ones that keep me fast but sane. It also goes without saying that different tricks will work for different people, so I hope my tips are general enough for most budding developers.

(By the way, most of these work best if you work in a Unix environment. Most of my work these days is done on OS X and when I’m ssh’d into Linux boxes.)

Small things

Select all/copy/paste (well-known)

[Cmd-a]/[Cmd-c]/[Cmd-v]

Quickly navigate and select text

Using keyboard shortcuts such as [Cmd-Left/Right]+[Shift-Cmd-Right/Left] to highlight a line. Experiment with these!

Get to the beginning of a line in the shell

[Ctrl-a]

New tab/Close tab in most programs that have tabs

[Cmd-t]/[Cmd-w]

Issue the last command in sudo mode

sudo !!

At first, I thought it was a funny alias my mentor made, as in, “sudo, darn it!”. But !! (read: “bang-bang“) is a great thing to know!!

Launch Sublime Text from the command line

If I’m not using Vim or MacDown, I open up Sublime like so:

subl README.md

You can either add an alias to the Sublime app in your shell profile or add a symlink to it in one of the /bin folders of your PATH.

More: Stack Overflow link

Take advantage of Spotlight (or Alfred)

It takes me less than a second to launch the most-used applications on my desktop. To do that, all I have to do is type [Cmd-Space] anywhere and the first few letters, if not just the first letter, of the application.

Medium things

Use oh-my-zsh

Oh My Zsh is an easy-to-install framework for installing and managing zsh, an alternative shell to bash. The most useful feature for me is its fast autocompletion and ability to simply tab through autocompletion results. It also makes your shell look fancy :)

More: Difference between bash and zsh

Use a good .vimrc if you use Vim

I’m pretty lazy and use the one by “amix” out of the box. If I cared more to study Vimscript, I could roll my own and be an 3l33t h4x0r, but this will do.

More: The ultimate vimrc

Add stuff to your .bash_profile/.bashrc/.zshrc

alias workonproj="ssh -t rainier.ababao@xyz.example.com 'cd ~/project_directory; bash'"

It’s quite nice to open up the shell and type in a short alias to ssh into a server.

export GITPAT="xyz"

If I have GitHub two-factor auth enabled, I’m required to use a “personal access token” to authenticate pushes if I cloned by HTTPS instead of SSH. In that case, I can just echo $GITPAT since it’s an environment variable in my private shell profile, and copy-paste it whenever I have to authenticate. However, I stopped doing this when I learned about triggers for my terminal…

More: Learning the Shell

Take advantage of iTerm2

For command line work, I use iTerm2 instead of Terminal. There are a lot of great things about it, and one of them is triggers. A trigger is a regex that iTerm2 listens for, followed by an action, like automatically entering text as if the user had typed it in themselves. I have a trigger set up so that when \[sudo\] password for rainier.ababao appears, the password manager opens and I just press [Enter] for iTerm2 to put in my password.

More: iTerm2 triggers

Large things

Improve your words per minute (WPM)

Learn to keep your fingers on the home row and touch-type! It will be slow at first, but you’ll see gains from 40 WPM to 100-120 WPM within a few months if you practice every day. That’s at least doubling one’s typing speed!

(Edit: This isn’t to suggest that efficient coding has much to do with how fast you can type, but rather that this reduces the repetitive task of looking for unnaturally-placed characters on the keyboard.)

Some fun: play.typeracer.com

Master a text editor

I started forcing myself to use Vim last year and have since gotten faster editing with it than with any other text editor. Vim comes installed with vimtutor, a classic interactive tutorial for the editor.

The common thread in a lot of these micro-optimizations is keeping your fingers on the keyboard and away from the mouse.

What are your favorite optimizations? Feel free to DM or tweet @rainieratx with corrections and suggestions.

LaTeX in a Makefile

In my parallel computing class this semester, we have to write reports with timing results and some analysis using LaTeX. Since we already cook up Makefiles to compile our MPI C programs, why don’t we construct our reports the same way too?

Here is a Makefile I wrote to compile my programs and construct the PDF for my report:

LATEXTMP=*.aux *.log *.toc
EXECS=ex6 ex7
MPICC?=mpicc
LATEXCMP=pdflatex

all: ${EXECS}

ex6: ex6.c
    ${MPICC} -o ex6 ex6.c

ex7: ex7.c
    ${MPICC} -o ex7 ex7.c

report: report.tex
    ${LATEXCMP} report.tex

clean:
    rm ${EXECS} ${LATEXTMP}

If you are familiar with Make, this set of rules should be fairly straightforward to follow. I also find this useful for getting rid of the temporary files that pdflatex generates, without say, fat-fingering a regex upon rm.

If this is new to you, the GNU Make Manual is a well-written tutorial to start.

Make Make great again! 😛

What kind of automation tools do you use for tasks that aren’t typically automated?

Binary search implementation in Erlang - what’s going on?

Let’s hack on some Erlang

Erlang was originally built in the late 80s to handle telecom switches. As such, its major features include primitives for concurrency and error recovery, which led to it making its way into applications requiring fault-tolerance, scalability, and maximum availability. I picked it up to learn more about concurrent programming and because it seemed hip.

Plus, here’s an awkwardly-funny video about one of its original uses in telephony:

Setup

If you’re on OSX and have Homebrew, you can install Erlang with brew install Erlang.

Now cd into a directory where you want to keep your file. To make things simpler, you’ll want to save your program and run the Erlang REPL from here, too. vim binary_search.erl, and let’s begin!

Writing the header

-module(binary_search).
-export([binary_search/2]).

What’s going on?

  • A module is just a bunch of functions regrouped into a single file, and has to be the first statement of the file.
  • We can use the name binary_search to call functions from this module into in other modules, in the form M:F(A), where M is the module name, F is the function name, and A is the arguments.
  • -export([binary_search/2]) defines which functions of the module can be called by the outside world, it describes the public interface of the module. It’s similar to how encapsulation in OOP restricts an object’s externally-usable components through the notions of public and private methods.
  • The export attribute has the format -export([Function1/Arity, Function2/Arity, ..., FunctionN/Arity]). The arity of a function or operator is the number of arguments or operands it takes. Thusly, binary_search has an arity of 2.

Guarding against incorrect types

The parameters of a binary search should be some comparable target value and a sorted list.

binary_search(X, List)
    when
        is_number(X),
        is_list(List) ->
    bs(X, lists:sort(List)).

What’s going on?

  • is_number and is_list are built-in functions of Erlang, and they do exactly what you’d expect. They are part of the erlang module, which is automatically imported whenever you use Erlang.
  • But every other function will have to be called in the M:F(A) format described in the last section (M - module name, F - function name, A - arguments)
    • In fact, this is how you will call every function you use in Erlang besides those in the erlang module.
    • You know how in Python, you can just from time import sleep, or in C, you #include <stdio.h> to add function names into the current namespace? Well, Erlang has a very flat namespace. This Stack Overflow link has a good explanation of how it works.
  • when is called a “guard” type of clause, and it tells us to perform the action following the -> if all statements between when and it are true.
  • Every function ends with a period. Whatever is the last expression of a function to be executed will have its value returned to the caller without it being mentioned.

The algorithm

The classic binary search algorithm. We will implement it recursively.

So, here’s how it looks in Erlang:

bs(_, [])   -> false;
bs(X, [_])  -> false;
bs(X, [X])  -> true;
bs(X, List) ->
    ok = io:format("bs(~p, ~p)~n", [X, List]),
    Length = length(List),
    Middle = (Length + 1) div 2,
    Value = lists:nth(Middle, List),
    if
        X == Value -> true;
        X > Value -> bs(X, lists:sublist(List, Middle + 1, Length));
        X < Value -> bs(X, lists:sublist(List, 1, Middle))
    end.

What’s going on?

  • The _ is the “don’t care” variable, which is a common thing to use in many languages (for instance, whenever you’re iterating and not using the counter, e.g. for _ in range(0, 10)). This also happens to silence a “unused variable” warning made by the Erlang compiler.
    • We see these in use for two of the base cases, and their placement makes sense if you think about it. Binary search of an empty list, list containing useless information, or target variable containing useless information should never be true.
  • We see use of pattern-matching. Instead of
function(Args)
   if X then
      Expression
   else if Y then
      Expression
   else
      Expression

We write:

function(X) ->
  Expression;
function(Y) ->
  Expression;
function(_) ->
  Expression.

The final code

Your file, binary_search.erl, should look something like this at the end:

-module(binary_search).
-export([binary_search/2]).

binary_search(X, List)
    when
        is_number(X),
        is_list(List) ->
    bs(X, lists:sort(List)).

bs(_, [])   -> false;
bs(X, [X])  -> true;
bs(X, [_])  -> false;
bs(X, List) ->
    ok = io:format("bs(~p, ~p)~n", [X, List]),
    Length = length(List),
    Middle = (Length + 1) div 2,
    Value = lists:nth(Middle, List),
    if
        X == Value -> true;
        X > Value -> bs(X, lists:sublist(List, Middle + 1, Length));
        X < Value -> bs(X, lists:sublist(List, 1, Middle))
    end.

Putting it to use

$ erl
Erlang/OTP 18 [erts-7.2.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V7.2.1  (abort with ^G)
1> c(binary_search).
{ok,binary_search}
2> binary_search:binary_search(2, [1, 2, 3, 4, 5]).
bs(2, [1,2,3,4,5])
bs(2, [1,2,3])
true

Cool! I know I left out a lot of stuff - mostly because I don’t want to rewrite all of Learn You Some Erlang for Great Good!

And I really hope this piques your interest to learn it more formally. If you want to learn more about Erlang, I highly recommend reading some of the links below.

Sources

[1] http://www.erlang.org/course/history.html
[2] http://veldstra.org/whyerlang/
[3] http://learnyousomeerlang.com/