I’ve participated in about four sets of Google interviews (of about 3 interviews each) for various positions. I’m still not a Googler though, which I guess indicates that I’m not the best person to give this advice. However, I think it’s about time I put in my 0.002372757892 bitcoins. I recently did exactly this to [...]
FM-Indexes and Backwards Search
Last time (way back in June! I have got to start blogging consistently again) I discussed a gorgeous data structure called the Wavelet Tree. When a Wavelet Tree is stored using RRR sequences, it can answer rank and select operations in $\mathcal{O}(\log{A})$ time, where A is the size of the alphabet. If the size of [...]
Wavelet Trees – an Introduction
Today I will talk about an elegant way of answering rank queries on sequences over larger alphabets – a structure called the Wavelet Tree. In my last post I introduced a data structure called RRR, which is used to quickly answer rank queries on binary sequences, and provide implicit compression. A Wavelet Tree organises a [...]
RRR – A Succinct Rank/Select Index for Bit Vectors
This blog post will give an overview of a static bitsequence data structure known as RRR, which answers arbitrary length rank queries in $\mathcal{O}(1)$ time, and provides implicit compression. As my blog is informal, I give an introduction to this structure from a birds eye view. If you want, read my thesis for a version [...]
Generating Binary Permutations in Popcount Order
I’ve been keeping an eye on the search terms that land people at my site, and although I get the occasional “alex bowe: fact or fiction” and “alex bowe bad ass phd student” queries (the frequency strangely increased when I mentioned this on Twitter) I also get some queries that relate to the actual content. [...]
Some Lazy Fun with Streams
Update: fellow algorithms researcher Francisco Claude just posted a great article about using lazy evaluation to solve Tic Tac Toe games in Common Lisp. Niki (my brother) also wrote a post using generators with asynchronous prefetching to hide IO latency. Worth a read I say! I’ve recently been obsessing over this programming idea called streams (also known as infinite lists or [...]
Design Pattern Flash Cards
Last year I studied a subject which required me to memorise design patterns. I tried online flash card web sites, but I was irritated that I didn’t own the data I put up (they had no export option). So I wrote a something in Python to generate flash cards for me using LaTeX and the [...]
Au Naturale – an Introduction to NLTK
This blog post is an introduction on how to make a key phrase extractor in Python, using the Natural Language Toolkit (NLTK). But how will a search engine know what it is about? How will this document be indexed correctly? A human can read it and tell that it is about programming, but no search [...]
Advice to CS Undergrads
Since I’m starting my PhD this year, I have been reflecting on how I would be different if I went back in time and started my degree all over again. I am also continuing tutoring, in my 4th year, and I have been occasionally approached by students and asked for general advice with their studies. [...]
