I finally jumped on the NoSQL bandwagon and gave Redis a try.
I’ve been hearing about NoSQL for quite some time as a lightweight but much faster alternative to a relational database: the advantage over an RDBMS is speed and ease of use, and the disadvantage is the lack of relations. Redis is one of several NoSQL systems, and I read about it recently.
import redis  # the redis-py client library
rs = redis.Redis('localhost')  # or the host address
rs.zcard('en:1gms')  # returns the cardinality (number of members) of the sorted set 'en:1gms'
rs.zscore('en:1gms', 'hello')  # returns the score stored for 'hello', i.e. its count
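The look-ups above assume the counts are already sitting in the sorted set. Here is a minimal sketch of how they might get there. The helper names `load_unigrams` and `unigram_count` are made up for illustration, and I am assuming redis-py 3.0 or later, where `zincrby` takes its arguments in the order (name, amount, value).

```python
from collections import Counter


def load_unigrams(client, tokens, key="en:1gms"):
    """Add the counts of `tokens` to the sorted set `key`.

    `client` is assumed to behave like redis.Redis (redis-py >= 3.0).
    Returns the number of distinct tokens seen in this batch.
    """
    counts = Counter(tokens)
    for word, count in counts.items():
        # ZINCRBY creates the member if it is missing, otherwise adds to its score.
        client.zincrby(key, count, word)
    return len(counts)


def unigram_count(client, word, key="en:1gms"):
    """Return the stored count for `word`, or 0 if the word was never seen."""
    score = client.zscore(key, word)  # None when the member does not exist
    return 0 if score is None else int(score)
```

With a server running, something like `load_unigrams(redis.Redis('localhost'), text.split())` would fill the set, after which `zcard` and `zscore` work as shown above.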
Of course it doesn’t do smoothing, but I read about Redis and I was excited to try it. I can now use Redis for anything that has large data and needs fast look-ups =D
At its core a key-value store like Redis is just a flat dictionary, but I am sure Redis is using some efficient mechanism for storing the data. It is written in ANSI C, so it only makes sense. Besides, Redis comes with ‘batteries included’: it has a server which, once running, can serve any client; it has a persistent data store, so the data comes back even if the server is stopped and restarted; its values don’t have to be strings but can be more complex data structures themselves; and finally, it has clients in a number of languages.
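To give a feel for values that are more than strings, here is a small sketch that keeps bigram counts in a Redis hash, one hash per first word. The key prefix `en:2gms:` and both function names are my own invention for this example; `hincrby` and `hgetall` are the redis-py calls I am relying on.

```python
def record_bigram(client, first, second, prefix="en:2gms:"):
    """Count one occurrence of the bigram (first, second).

    Each first word gets its own hash; the fields are the following words.
    `client` is assumed to behave like redis.Redis.
    """
    # HINCRBY creates the hash and the field on first use.
    return client.hincrby(prefix + first, second, 1)


def followers(client, first, prefix="en:2gms:"):
    """Return {following_word: count} for `first` as plain str -> int."""
    raw = client.hgetall(prefix + first)
    result = {}
    for field, value in raw.items():
        # redis-py returns bytes unless the client was created with decode_responses=True.
        if isinstance(field, bytes):
            field = field.decode("utf-8")
        result[field] = int(value)
    return result
```

The same information could be flattened into string keys, but a hash keeps everything that follows one word together under a single key.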
Redis provides quite a few benefits over a classical RDBMS, as enumerated here – for example better, more efficient data structures. However, a caveat is that once the data becomes greater than the memory and the system starts paging, the performance degrades radically. So perhaps this method is not the silver bullet for looking up all n-grams instantaneously if you don’t have enough memory. But it can still be useful (and it’s easy to use, and it comes with batteries included as mentioned earlier) for several other scenarios.
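Since the caveat is about outgrowing memory, it may be worth keeping an eye on how much Redis is actually using. A rough sketch: `memory_report` is a hypothetical helper, and the budget is whatever limit you pick yourself (say, a fraction of physical RAM). It reads the `used_memory` field from the server's INFO output via redis-py's `info('memory')`.

```python
def memory_report(client, budget_bytes):
    """Compare Redis' reported memory use against a caller-chosen budget.

    `client` is assumed to behave like redis.Redis; `budget_bytes` is an
    assumption supplied by the caller, not something Redis knows about.
    """
    info = client.info("memory")      # dict of fields from INFO's memory section
    used = int(info["used_memory"])   # bytes allocated by Redis for its data
    return {
        "used_bytes": used,
        "budget_bytes": budget_bytes,
        "within_budget": used <= budget_bytes,
    }
```

Calling this now and then while loading data gives an early warning before the machine starts paging.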
Go learn yourself some Redis 🙂