Best of 2016

As is the tradition of many technologists, I am writing about some of the best things that I discovered or did in 2016. This post is a little bit late, but I discovered a lot of great things in 2016 that I believe deserve to be shared.

Total books read: 50

Best non-technical books read
Michio Kaku – The Future of the Mind: It’s nice to know what science could bring to our lives. Dr Kaku is a visionary and while I find many of his predictions very ‘out there’, he is very comprehensive in his style and claims, which makes his books a delight to consume

Best technical books read
Kenneth Reitz & Tanya Schlusser – The Hitchhiker’s Guide to Python: Best Practices for Development. While not for the beginner, if you use Python at all at your day job and have been churning out Python code for a while, I highly recommend grabbing a copy. It walks you through some of the best practices as well as the best-known and most widely used libraries out there, which are must-haves for your toolbox

Best technical videos consumed
Kent Beck’s ‘livestorm’ about convex and concave software projects

Best new technologies discovered and used
Docker: containerization is very hot right now
WebSocket: A protocol for full-duplex communication between a client and a server over a single long-lived connection
Nginx: A powerful and scalable Web server

Best new languages discovered and used
Lua: A lightweight but very powerful ‘glue’ language

Best new hardware acquired and used
Pok3r keyboard (it has Clear switches, and I am not sure whether I like it better than my Das Keyboard with Blue switches, but it’s still pretty nice), Steelcase Gesture (finally got onto the bandwagon of expensive ergonomic ‘programmer’ chairs), Amazon Echo

Best new apps discovered and used
Productive (iOS) – a nice way to build habits – you can select one of the built-in habits or add your own, and you can select icons, frequencies, etc. It’s also nice to be able to view stats on the individual habits to see how well you’ve been doing
CleanMyMac (macOS) – does a good job of cleaning up extraneous files left over by installers and the system itself

Best new fitness videos consumed
Fittest on Earth (Netflix) – an incredible, candid look at the CrossFit lifestyle

Best new workout technique discovered and used
Zottman curls – a comprehensive arms exercise that targets both the biceps and the forearms

Best new nutritional supplements discovered and used
Creatine and magnesium. I can’t believe I’ve lived this long without these. Definitely game changers

My first foray into Google Go, and a URL shortener

I’ve been interested in Google’s Go for a while now. It feels like one of those new languages that are very promising and are here to stay. Certain developers I look up to in the community have also been talking about how Go is great for concurrency and a great language to work with in general.

So far I’m only done with An Introduction to Programming in Go (a short, free, online book), and halfway through Go by Example, but it was enough for me to start creating toy programs of my own. There’s a lot more to read and learn in Go – but I recommend both of the above resources as quick getting started guides for intermediate programmers.

I’ve also been interested in expanding my horizons beyond the day-to-day of my job, so I wanted to explore an algorithm that was new to me, such as URL shortening. I found a good explanation on Stack Overflow, so I decided to give it a go.

Here is the result. It’s a very primitive piece, to be sure, and it can be improved in a gazillion ways. I’ll be sure to check back as I learn more Go and more effective Go, and will keep improving both the functionality and the cleanliness of the code. For now, it is just a handful of functions that build maps between digits and letters and do some division and string concatenation. There may well be a much better way of creating the alphabet map than the way I’m doing it, but it does seem to get the job done for a given alphabet.
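A minimal sketch of the base-conversion idea described above, written in Python rather than Go. It is not the code linked here. The alphabet (lowercase, uppercase, digits) and the names ALPHABET, INDEX, encode and decode are assumptions for illustration only.

```python
import string

# Hypothetical alphabet: lowercase, uppercase, then digits (62 symbols).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
# Reverse map from each symbol to its digit value.
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(n: int) -> str:
    """Turn a non-negative integer ID into a short string by repeated division."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    base = len(ALPHABET)
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode(code: str) -> int:
    """Turn a short string back into the integer ID."""
    base = len(ALPHABET)
    n = 0
    for ch in code:
        n = n * base + INDEX[ch]
    return n
```

With this alphabet, an ID such as 125 becomes a two-character code, and decode(encode(n)) returns n for any non-negative n. A shortener would typically store the long URL under an integer ID and hand out the encoded form.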

Finally, I’d like to say that I’ve really been liking Go so far. It feels like a cleaner C. It has some aspects of more modern programming languages too, e.g. interfaces. But it has no classes, only structs. It also seems to be a simpler language, e.g. it doesn’t confuse you with three to four different ways of constructing loops – there’s only for, nothing else. I like the panic and defer mechanisms, and I also like goroutines and channels. I’m not the most experienced person when it comes to concurrency, but so far it has been a pretty smooth ride for me, with no surprises (well, only pleasant ones). Go also strikes a nice balance in terms of memory management: it has garbage collection, but it also gives you pointers (though without pointer arithmetic). A great language overall, I must say.

I use Vim for everything whenever possible, and most of my Go adventures have been in Vim too. There are some plugins out there that make it easier to manage imports, but I haven’t tried any so far apart from syntax highlighting. I’ve had a brief stint with LiteIDE, and it’s not bad: it does a fantastic job of IntelliSense-style completion, but the visuals are not the greatest (I believe it’s Qt-based), and neither the general shortcuts nor the text-editing shortcuts are impressive. I’d use it if I must, but I can stick with Vim otherwise.

I have a feeling that there’s going to be more to come from me in GoLang. Here’s something funny to keep the spirit up 🙂

GNU Parallel on OS X and Linux

GNU Parallel is a great utility for parallelizing any computation from the command line. The book Data Science at the Command Line discusses, amongst several other things, how to use GNU Parallel to distribute your data over different machines. The toy example/tutorial in the book makes three assumptions:

(1) all machines you are using are running Ubuntu or some variant of Linux

(2) you are using a bunch of Amazon EC2 instances to do your parallelization (and hence need to find out the IPs of all your instances in a non-straightforward way)

(3) you are using GNU paste that comes pre-installed on all Ubuntu systems

I am presenting a tutorial that works with the premise that

(1) you are primarily using OS X and might have some Ubuntu machines as some of your instances

(2) all your machines are local (as in connected through a LAN)

(3) you are using the OS X variant of paste (which has a nuance compared to the Ubuntu version)

Here is a walkthrough that basically replicates the toy example in the book, but highlights the differences you’ll need to incorporate in an OS X environment.

First, you can install GNU Parallel on OS X through Homebrew:

$ brew install parallel

Next, create your instances file (named ‘instances’), and add the hostnames of your local machines as shown in the screenshot. In my case, Cadmius happens to be Ubuntu 14.04, and macusers-Macbook is running OS X Mavericks. The main machine through which I am parallelizing things is also running OS X Mavericks.

The instances file

You do not need to have Parallel installed on the ‘slave’ machines, but you might want to in case you want finer control.

Also, in case you don’t want to repeatedly enter your SSH password when Parallel is talking to the slave machines (it uses SSH underneath), you might want to enable password-free login to your slaves.

Also notice how in this case I’ve put in the username along with the hostnames; another difference from the book which uses EC2 instances and doesn’t need different usernames for the different IPs.

Great, now we’re ready to test if everything went well. Run the following command on the master machine:

$ seq 1000 | parallel -N100 --pipe --slf instances "(hostname; wc -l) | paste -sd: -"

Testing GNU Parallel

Notice the additional ‘-’ after the arguments to paste. It is required on OS X, whose paste needs an explicit ‘-’ to read from standard input. The book doesn’t have it because GNU paste on Linux reads standard input by default. Without it, OS X will complain. With it, both OS X and Ubuntu seem happy (yet you can see the differences in the outputs from the two kinds of machines).

Apart from that difference, the command is copied from the book; it generates a sequence of 1000 numbers and distributes them to the slaves in chunks of 100. The output shows the hostnames and how many numbers were passed to each. Notice the error message. I’ve mentioned how Parallel doesn’t need to be installed on all machines for basic usage. In this case I did install Parallel on the master and two slaves, but it seems Parallel doesn’t like the fact that macusers-Macbook is an older machine with a Core 2 Duo? Not sure about that. Cadmius happens to have 8 CPUs (and 32 cores).

Finally, run the following to sum the numbers in parallel and then sum the 10 sums on the host machine:

$ seq 1000 | parallel -N100 --pipe --slf instances "paste -sd+ - | bc" | paste -sd+ - | bc

This command is also from the book and will give you the sum: each chunk of 100 numbers is summed separately on the slave machines in parallel processes, and the 10 partial sums are then summed locally. (Again, notice the additional ‘-’ in the paste commands.)
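As a sanity check on the arithmetic, here is a small single-machine sketch of the same split-then-sum idea in Python. It does not involve Parallel or SSH at all; the variable names are just for illustration.

```python
# Split 1..1000 into chunks of 100, sum each chunk, then sum the partial sums.
numbers = list(range(1, 1001))   # what `seq 1000` produces
chunk_size = 100                 # mirrors -N100

chunks = [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]
partial_sums = [sum(chunk) for chunk in chunks]   # one sum per chunk
total = sum(partial_sums)                         # the final, local sum

print(len(partial_sums), total)
```

The total should match whatever the pipeline above prints, since summing in chunks does not change the result.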

The purpose of this post has been to essentially highlight all the changes I had to make in order to successfully run the toy example from the book on a mixture of OS X and Linux machines all running locally. Thanks for reading.

Panoply: Console-based TODO list manager

That’s right, yet another todo list manager. I am calling it Panoply, for no other reason than the fact that I like that word. Well, it is a learning project. I wanted something that does not depend on having an Internet connection all the time, unlike, say, Wunderlist and TeuxDeux (it is possible that those awesome products have offline modes, but still), and something that is free (unlike Things). More importantly, I wanted to check the feasibility of Python for building, testing and maintaining a relatively large project. It certainly doesn’t have the graphical finesse of the apps mentioned above, but if I ever decide to go in that direction with it, it will be yet another learning adventure. As it stands right now though, Panoply remains a CLI app.

It’s still very much in a pre-alpha stage, but I have something running and working that I can now keep tweaking and enhancing. I am trying to follow Test Driven Development in this project as faithfully as I can (though I guess I could be better). The goal is essentially to write a command line tool in Python that helps with deadlines and in managing personal tasks. It is supposed to be an app that automatically adjusts itself somehow, based on current deadlines, future deadlines, and past overdue deadlines, so that it is better than a plain text todo list. I am the sole developer, user experience designer, user interaction designer and the tester of all things for now. I am hoping that that would change at some point. The data model that Panoply is currently using consists of simple CSV files. This aspect of the project might very well change in the future if I feel the need for a better data structure.

My idea is to keep the scope minimal, and enhance it one feature at a time. As of the day of writing this blog entry, the app lets you start a task collection, add a task to the collection with a user name, and save and load the entire list of tasks. It also lets you selectively check off items, so that you can mark them as ‘done’ and they no longer show in the list when you view it. Last but not least, it can scan the tasks and task collections and prompt the user to hustle if any task has a deadline before the current date (termed ‘overdue’ in the Panoply universe).
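To make the ‘overdue’ idea concrete, here is a rough sketch of how such a scan could work over CSV data. The column names (name, deadline, done), the ISO date format and the function name overdue_tasks are all assumptions for illustration, not Panoply’s actual schema or code.

```python
import csv
import io
from datetime import date


def overdue_tasks(csv_text, today):
    """Return the names of unfinished tasks whose deadline is before `today`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    overdue = []
    for row in reader:
        if row["done"] == "yes":
            continue  # checked-off tasks are never overdue
        if date.fromisoformat(row["deadline"]) < today:
            overdue.append(row["name"])
    return overdue


# Hypothetical sample data in the assumed layout.
SAMPLE = """name,deadline,done
file taxes,2014-04-15,no
buy milk,2014-06-01,yes
write blog post,2014-07-01,no
"""

print(overdue_tasks(SAMPLE, date(2014, 6, 15)))
```

Passing the current date in as a parameter, instead of calling date.today() inside the function, keeps a check like this easy to unit test.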

I am having a lot of fun with this project, trying to hack on it for 10 minutes every other day. My full time job and other obligations don’t allow for more at this point, but Panoply will certainly grow with time.

Using Vim to semi-automatically ingest contents of files

The title is vague. The post has to do with the workflow I came up with and used for something today. Let me just walk you through a hypothetical scenario and how to solve it. The tutorial is *Nix specific.

Imagine you have a bunch of files under a directory, with some plain text content, and you want to ingest the contents of the files into a single file. That by itself is easy to do with a simple $ cat * > target_file. But what if you also want the names of the files alongside the contents, so you know where each piece came from? Maybe you also want to edit some of the content out, and most likely you want the files ingested in some sort of temporal order. For the sake of this example, let us assume you want the files in chronological order, with the contents of the oldest file on top.

Can’t think of a scenario? How about having a bunch of notes, each written once a week, with the dates being the names of the files – and you want to conflate the notes into a single, master-notes file for the whole year, and you want the contents in the order that they were created, with the notes created on the oldest date appearing first.

Here’s how you do it with the combination of a shell and Vim.

Step 1: $ ls -ltr > master_file.txt – this will put a list of the files in chronological order into master_file.txt. Note that the ‘t’ flag on its own sorts newest first, so you need ‘r’ to reverse that

Step 2: Open the master_file.txt in Vim

Step 3: Get rid of the stuff you do not want using visual blocks or whatever Vim magic you know in a few keystrokes

Step 4: Also get rid of the name ‘master_file.txt’. Now you have a raw list of all the files whose contents you want, listed in chronological order

Step 5: Start recording a macro (say, qa)

Step 6: Place the cursor at the beginning of the first line. Press yy to copy the first file’s name

Step 7: Press o<Enter> to make some space

Step 8: Hit <Esc> followed by :r <Ctrl-r>” to paste the contents of the filename you just copied from the default register

Step 9: You might have to press <Del> once to get rid of the ^M character that gets pasted

Step 10: Hit <Enter> and the contents of the file will be in the buffer at the appropriate place

Step 11: Place the cursor at the beginning of the next file’s name, and stop recording the macro (q)

Step 12: Got 1000 files? 1000@a (or whatever register you recorded it in) is your friend!

And trust me – if you are a Vim user, it’s much faster and simpler than it seems here in the long description. If you are a newbie, trust me – it’s much faster, simpler and more fun than it seems.
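For comparison, the end result of the steps above (minus the manual editing in Step 3) can be sketched as a small Python script. This is an alternative to the Vim macro, not a transcription of it. The function name ingest and the default target name are assumptions for illustration.

```python
from pathlib import Path


def ingest(directory, target_name="master_file.txt"):
    """Concatenate files in `directory`, oldest first, each preceded by its name."""
    folder = Path(directory)
    # Skip sub-directories and the target file itself.
    files = [p for p in folder.iterdir() if p.is_file() and p.name != target_name]
    # Sort by modification time, oldest first (the same order as `ls -tr`).
    files.sort(key=lambda p: p.stat().st_mtime)
    parts = []
    for path in files:
        # File name, a blank line, then that file's contents.
        parts.append(f"{path.name}\n\n{path.read_text()}\n")
    (folder / target_name).write_text("".join(parts))
    return [p.name for p in files]
```

Calling ingest(".") in the notes directory writes master_file.txt there and returns the file names in the order they were ingested.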

Peace.

Fixing pip and easy_install, and getting colorful output from unittest

I’ve noticed in a lot of talks and videos that people in the Ruby universe who work with tools like RSpec for writing unit tests or doing TDD often have nice colorful output. Aesthetics aside, I like the idea of the visual feedback – with red indicating your errors or failures and green or blue indicating your passing tests. It is much easier to get feedback from color than to carefully read the text scrolling by on the terminal. There has to be a reason, after all, why TDD often has the moniker red-green-refactor. Searching on this matter, I rather quickly came across pyrg, which seems like an excellent tool.

However, since I had more or less replaced my existing Python installation (OS X 10.6) with the version from MacPorts, my easy_install and pip were still pointing toward one of my older Python installations. Ergo, pyrg wouldn’t run. After researching for a while I came across this excellent post on Stack Overflow, and followed the instructions to uninstall my easy_install and then reinstall it using the distribute package mentioned there, followed by reinstalling pip.

This by itself didn’t do anything, and I had to make sure in my shell’s rc file that the MacPorts version of Python was in the path before the other, regular places like /usr/local/bin. That fixed it. Run with $ pyrg python_file or $ python python_file |& pyrg. Colors can be easily customized as well. Hooray for going from the colorless test run output on the left to proper visual feedback on the right!

Colorful with pyrg

Colorless without pyrg

The Power of Eclipse

I am a Vim guy. I use it every day. I spend time learning things I can do with it. I’ve gotten good with it over the years. I’ve read books, I’ve followed blogs and videos, I script it to my liking using VimScript, and I play VimGolf. I mostly use Vim for smaller scripts and quick proof-of-concept programming, love it to bits, and learn new things all the time.

Yet I always believed there’s something definitely good about all these IDEs out there – there have to be tons of language-specific features that probably a text editor cannot cover.

Enter my new job. I am supposed to be using Java quite a bit now, so I decided to read through the Eclipse Pocket Guide, apart from some other books on JUnit and Java (irrelevant to this post).

Eclipse is fun! I’ve already learned about tons of features I didn’t even know existed! Here’s a quick summary of what I especially like:

[1] Your entire codebase is hyperlinked: hover your mouse over a code fragment with the Ctrl or Cmd key pressed and the hyperlinks activate, so navigating your codebase (including all libraries) is just one mouse click away

[2] Java scrapbook pages [likely a predecessor to Scala IDE Worksheets] – Open using File -> New -> Other -> Java -> Java Run/Debug -> Scrapbook page, type in an expression, select the text of the expression and hit Cmd+Shift+D or Ctrl+Shift+D, and voila – it is evaluated for you. Not exactly a REPL, but close

[3] Easy JUnit integration

[4] A solid debugger

[5] Several options in terms of perspectives and views, with different views available for different needs. You can even save personalized perspectives for future use

[6] Great refactoring tools

[7] Easy running shortcuts – Cmd+Option+X followed by J for a Java application, Cmd+Option+X followed by T for a JUnit test case, etc

I am sure there’s a lot more stuff – I’m having fun learning all that 🙂