<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>Death of a Gremmie</title>
    <link>http://deathofagremmie.com</link>
    <description>Brian Neal's blog about programming.</description>
    <pubDate>Sun, 13 May 2012 18:30:33 GMT</pubDate>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>Who's Online with Redis &amp; Python, a slight return</title>
      <link>http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return</link>
      <pubDate>Sat, 17 Dec 2011 19:05:00 CST</pubDate>
      <category><![CDATA[Python]]></category>
      <category><![CDATA[Redis]]></category>
      <guid isPermaLink="true">http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return</guid>
      <description>Who's Online with Redis &amp; Python, a slight return</description>
      <content:encoded><![CDATA[<div class="document">
<p>In a <a class="reference external" href="http://deathofagremmie.com/2011/04/25/a-better-who-s-online-with-redis-python/">previous post</a>, I blogged about building a &quot;Who's Online&quot; feature using
<a class="reference external" href="http://redis.io/">Redis</a> and <a class="reference external" href="http://www.python.org">Python</a> with <a class="reference external" href="https://github.com/andymccurdy/redis-py">redis-py</a>.  I've been integrating <a class="reference external" href="http://celeryproject.org">Celery</a> into my
website, and I stumbled across this old code. Since I made that post, I
discovered yet another cool feature in Redis: sorted sets. So here is an even
better way of implementing this feature using Redis sorted sets.</p>
<p>A sorted set in Redis is like a regular set, but each member has a numeric
score. When you add a member to a sorted set, you also specify the score for
that member. You can then retrieve set members if their score falls into a
certain range. You can also easily remove members outside a given score range.</p>
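<p>To make those semantics concrete, here is a minimal pure-Python sketch that models a sorted set as a dictionary mapping members to scores. It only illustrates the behavior described above. It is not how Redis stores sorted sets internally, and the <tt class="docutils literal">TinySortedSet</tt> class is hypothetical; its method names merely echo the Redis commands.</p>

```python
class TinySortedSet(object):
    """A toy, in-memory stand-in for a Redis sorted set (illustration only)."""

    def __init__(self):
        self._scores = {}  # member -> numeric score

    def zadd(self, member, score):
        # Adding an existing member simply overwrites its score.
        self._scores[member] = score

    def zrangebyscore(self, lo, hi):
        # Members whose score lies in [lo, hi], ordered by score, then name.
        hits = [(s, m) for m, s in self._scores.items() if hi >= s >= lo]
        return [m for s, m in sorted(hits)]

    def zremrangebyscore(self, lo, hi):
        # Remove every member whose score lies in [lo, hi]; return the count.
        stale = self.zrangebyscore(lo, hi)
        for member in stale:
            del self._scores[member]
        return len(stale)


online = TinySortedSet()
online.zadd("sally", 100)
online.zadd("harry", 400)
online.zadd("sally", 500)  # sally is seen again, so her score is updated
recent = online.zrangebyscore(300, 600)
removed = online.zremrangebyscore(0, 450)
```

<p>Because adding an existing member only overwrites its score, a single structure is enough to record both who has been seen and when.</p>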
<p>For a &quot;Who's Online&quot; feature, we need a sorted set to represent the set
of all users online. Whenever we see a user, we insert that user into the set
along with the current time as their score. This is accomplished with the Redis
<a class="reference external" href="http://redis.io/commands/zadd">zadd</a> command.  If the user is already in the set, <a class="reference external" href="http://redis.io/commands/zadd">zadd</a> simply updates
their score with the current time.</p>
<p>To obtain the current list of who's online, we use the <a class="reference external" href="http://redis.io/commands/zrangebyscore">zrangebyscore</a> command to
retrieve the list of users whose score (time) lies between, say, 15 minutes ago
and now.</p>
<p>Periodically, we need to remove stale members from the set. This can be
accomplished by using the <a class="reference external" href="http://redis.io/commands/zremrangebyscore">zremrangebyscore</a> command. This command will remove
all members that have a score between minimum and maximum values. In this case,
we can use the beginning of time for the minimum, and 15 minutes ago for the
maximum.</p>
<p>That's really it in a nutshell. This is much simpler than my previous
solution which used two sets.</p>
<p>So let's look at some code. The first problem we need to solve is how to
convert a Python <tt class="docutils literal">datetime</tt> object into a score. This can be accomplished by
converting the <tt class="docutils literal">datetime</tt> into a POSIX timestamp integer, which is the number
of seconds since the UNIX epoch of January 1, 1970.</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">datetime</span>
<span class="kn">import</span> <span class="nn">time</span>

<span class="k">def</span> <span class="nf">to_timestamp</span><span class="p">(</span><span class="n">dt</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;</span>
<span class="sd">    Turn the supplied datetime object into a UNIX timestamp integer.</span>

<span class="sd">    &quot;&quot;&quot;</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">time</span><span class="o">.</span><span class="n">mktime</span><span class="p">(</span><span class="n">dt</span><span class="o">.</span><span class="n">timetuple</span><span class="p">()))</span>
</pre></div>
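<p>One caveat worth keeping in mind: <tt class="docutils literal">time.mktime</tt> interprets the time tuple as local time, so the function above assumes <tt class="docutils literal">dt</tt> is a naive local datetime, such as the result of <tt class="docutils literal">datetime.datetime.now()</tt>. If you keep naive UTC datetimes instead, a variant along these lines should work. This is only a sketch, and the name <tt class="docutils literal">to_utc_timestamp</tt> is hypothetical:</p>

```python
import calendar
import datetime


def to_utc_timestamp(dt):
    """
    Turn a naive datetime that represents UTC into a UNIX timestamp integer.

    """
    # calendar.timegm is the UTC counterpart of time.mktime.
    return calendar.timegm(dt.utctimetuple())


# One minute past the epoch is 60, whatever the local time zone is.
one_minute = to_utc_timestamp(datetime.datetime(1970, 1, 1, 0, 1, 0))
```

<p>Whichever variant you pick, use the same one for writing scores and for computing the query range, so that both sides agree on what "now" means.</p>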
<p>With that handy function, here are some examples of the operations described
above.</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">redis</span>

<span class="c"># Redis set keys:</span>
<span class="n">USER_SET_KEY</span> <span class="o">=</span> <span class="s">&quot;whos_online:users&quot;</span>

<span class="c"># the period over which we collect who&#39;s online stats:</span>
<span class="n">MAX_AGE</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">timedelta</span><span class="p">(</span><span class="n">minutes</span><span class="o">=</span><span class="mi">15</span><span class="p">)</span>

<span class="c"># obtain a connection to redis:</span>
<span class="n">conn</span> <span class="o">=</span> <span class="n">redis</span><span class="o">.</span><span class="n">StrictRedis</span><span class="p">()</span>

<span class="c"># add/update a user to the who&#39;s online set:</span>

<span class="n">username</span> <span class="o">=</span> <span class="s">&quot;sally&quot;</span>
<span class="n">ts</span> <span class="o">=</span> <span class="n">to_timestamp</span><span class="p">(</span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">())</span>
<span class="n">conn</span><span class="o">.</span><span class="n">zadd</span><span class="p">(</span><span class="n">USER_SET_KEY</span><span class="p">,</span> <span class="n">ts</span><span class="p">,</span> <span class="n">username</span><span class="p">)</span>

<span class="c"># retrieve the list of users who have been active in the last MAX_AGE minutes</span>

<span class="n">now</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">()</span>
<span class="n">min_score</span> <span class="o">=</span> <span class="n">to_timestamp</span><span class="p">(</span><span class="n">now</span> <span class="o">-</span> <span class="n">MAX_AGE</span><span class="p">)</span>
<span class="n">max_score</span> <span class="o">=</span> <span class="n">to_timestamp</span><span class="p">(</span><span class="n">now</span><span class="p">)</span>

<span class="n">whos_online</span> <span class="o">=</span> <span class="n">conn</span><span class="o">.</span><span class="n">zrangebyscore</span><span class="p">(</span><span class="n">USER_SET_KEY</span><span class="p">,</span> <span class="n">min_score</span><span class="p">,</span> <span class="n">max_score</span><span class="p">)</span>

<span class="c"># e.g. whos_online = [&#39;sally&#39;, &#39;harry&#39;, &#39;joe&#39;]</span>

<span class="c"># periodically remove stale members</span>

<span class="n">cutoff</span> <span class="o">=</span> <span class="n">to_timestamp</span><span class="p">(</span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">()</span> <span class="o">-</span> <span class="n">MAX_AGE</span><span class="p">)</span>
<span class="n">conn</span><span class="o">.</span><span class="n">zremrangebyscore</span><span class="p">(</span><span class="n">USER_SET_KEY</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">cutoff</span><span class="p">)</span>
</pre></div>
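<p>A note for readers on newer versions of redis-py: starting with redis-py 3.0, <tt class="docutils literal">zadd</tt> takes a mapping of member names to scores instead of positional score and name arguments, and <tt class="docutils literal">StrictRedis</tt> became an alias for <tt class="docutils literal">Redis</tt>. The sketch below wraps the three operations in functions written in that newer style. The function names and the <tt class="docutils literal">conn</tt> and <tt class="docutils literal">now</tt> parameters are hypothetical, chosen for illustration; <tt class="docutils literal">conn</tt> is assumed to be a <tt class="docutils literal">redis.Redis</tt> instance or anything else that offers the same three methods.</p>

```python
import time

USER_SET_KEY = "whos_online:users"
MAX_AGE_SECS = 15 * 60  # 15 minutes, expressed in seconds


def touch_user(conn, username, now=None):
    """Add the user to the set, or refresh their score if already present."""
    now = int(time.time()) if now is None else now
    # redis-py 3.0 and later: scores are passed as a {member: score} mapping.
    conn.zadd(USER_SET_KEY, {username: now})


def whos_online(conn, now=None):
    """Return the members seen within the last MAX_AGE_SECS seconds."""
    now = int(time.time()) if now is None else now
    return conn.zrangebyscore(USER_SET_KEY, now - MAX_AGE_SECS, now)


def prune_stale(conn, now=None):
    """Remove members whose score is MAX_AGE_SECS old or older."""
    now = int(time.time()) if now is None else now
    return conn.zremrangebyscore(USER_SET_KEY, 0, now - MAX_AGE_SECS)
```

<p>Passing the connection and the current time in as arguments is a design choice made for this sketch: it lets you exercise the functions against a stand-in object without a running Redis server.</p>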
</div>
]]></content:encoded>
    </item>
    <item>
      <title>A better "Who's Online" with Redis &amp; Python</title>
      <link>http://deathofagremmie.com/2011/04/25/a-better-who-s-online-with-redis-python</link>
      <pubDate>Mon, 25 Apr 2011 12:00:00 CDT</pubDate>
      <category><![CDATA[Python]]></category>
      <category><![CDATA[Redis]]></category>
      <guid isPermaLink="true">http://deathofagremmie.com/2011/04/25/a-better-who-s-online-with-redis-python</guid>
      <description>A better "Who's Online" with Redis &amp; Python</description>
      <content:encoded><![CDATA[<div class="document">
<p><strong>Updated on December 17, 2011:</strong> I found a better solution. Head on over to
the <a class="reference external" href="http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return/">new post</a> to check it out.</p>
<div class="section" id="who-s-what">
<h3>Who's What?</h3>
<p>My website, like many others, has a &quot;who's online&quot; feature. It displays the
names of authenticated users that have been seen over the course of the last ten
minutes or so. It may seem a minor feature at first, but I find it really does a lot to
&quot;humanize&quot; the site and make it seem more like a community gathering place.</p>
<p>My first implementation of this feature used the MySQL database to update a
per-user timestamp whenever a request from an authenticated user arrived.
Actually, this seemed excessive to me, so I used a strategy involving an &quot;online&quot;
cookie that has a five minute expiration time. Whenever I see an authenticated
user without the online cookie I update their timestamp and then hand them back
a cookie that will expire in five minutes. In this way I don't have to hit the
database on every single request.</p>
<p>This approach worked fine, but it had some aspects that didn't sit right with me:</p>
<ul class="simple">
<li>It seems like overkill to use the database to store temporary, trivial information like
this. It doesn't feel like a good use of a full-featured relational database
management system (RDBMS).</li>
<li>I am writing to the database during a GET request. Ideally, all GET requests should
be idempotent. Of course if this is strictly followed, it would be
impossible to create a &quot;who's online&quot; feature in the first place. You'd have
to require the user to POST data periodically. However, writing to an RDBMS
during a GET request is something I feel guilty about and try to avoid when I
can.</li>
</ul>
</div>
<div class="section" id="redis">
<h3>Redis</h3>
<p>Enter <a class="reference external" href="http://redis.io/">Redis</a>. I discovered Redis recently, and it is pure, white-hot
awesomeness. What is Redis? It's one of those projects that gets slapped with
the &quot;NoSQL&quot; label. And while I'm still trying to figure that buzzword out, Redis makes
sense to me when described as a lightweight data structure server.
<a class="reference external" href="http://memcached.org/">Memcached</a> can store key-value pairs very fast, where the value is always a string.
Redis goes one step further and stores not only strings, but data
structures like lists, sets, and hashes. For a great overview of what Redis is
and what you can do with it, check out <a class="reference external" href="http://simonwillison.net/static/2010/redis-tutorial/">Simon Willison's Redis tutorial</a>.</p>
<p>Another reason why I like Redis is that it is easy to install and deploy.
It is straight C code without any dependencies. Thus you can build it from
source just about anywhere. Your Linux distro may have a package for it, but it
is just as easy to grab the latest tarball and build it yourself.</p>
<p>I've really come to appreciate Redis for being such a small and lightweight
tool. At the same time, it is very powerful and effective for filling those
tasks that a traditional RDBMS is not good at.</p>
<p>For working with Redis in Python, you'll need to grab Andy McCurdy's <a class="reference external" href="https://github.com/andymccurdy/redis-py">redis-py</a>
client library. It can be installed with a simple</p>
<div class="highlight"><pre><span class="nv">$ </span>sudo pip install redis
</pre></div>
</div>
<div class="section" id="who-s-online-with-redis">
<h3>Who's Online with Redis</h3>
<p>Now that we are going to use Redis, how do we implement a &quot;who's online&quot;
feature? The first step is to get familiar with the <a class="reference external" href="http://redis.io/commands">Redis API</a>.</p>
<p>One approach to the &quot;who's online&quot; problem is to add a user name to a set
whenever we see a request from that user. That's fine but how do we know when
they have stopped browsing the site? We have to periodically clean out the
set in order to time people out. A cron job, for example, could delete the
set every five minutes.</p>
<p>A small problem with deleting the set is that people will abruptly disappear
from the site every five minutes. In order to give more gradual behavior we
could utilize two sets, a &quot;current&quot; set and an &quot;old&quot; set. As users are seen, we
add their names to the current set. Every five minutes or so (season to taste),
we simply overwrite the old set with the contents of the current set, then clear
out the current set. At any given time, the set of who's online is the union
of these two sets.</p>
<p>This approach doesn't give exact results of course, but it is perfectly fine for my site.</p>
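<p>The rotation is easy to try out with plain Python sets before involving Redis at all. The sketch below is only a model of the idea; the <tt class="docutils literal">TwoSetTracker</tt> class and its method names are hypothetical, invented for illustration.</p>

```python
class TwoSetTracker(object):
    """A toy, in-memory model of the current/old two-set scheme."""

    def __init__(self):
        self.current = set()
        self.old = set()

    def report_user(self, username):
        # Every sighting lands in the current set.
        self.current.add(username)

    def tick(self):
        # Overwrite the old set with the current one, then start afresh.
        self.old = self.current
        self.current = set()

    def get_users_online(self):
        # Who's online is the union of the two sets.
        return self.current | self.old


tracker = TwoSetTracker()
tracker.report_user("sally")
tracker.tick()  # sally moves to the old set
tracker.report_user("harry")
after_one_tick = tracker.get_users_online()
tracker.tick()  # sally ages out; harry moves to the old set
after_two_ticks = tracker.get_users_online()
```

<p>In this model a user stays listed for somewhere between one and two tick intervals after they were last seen, which is the inexactness mentioned above.</p>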
<p>Looking over the Redis API, we see that we'll be making use of the following
commands:</p>
<ul class="simple">
<li><a class="reference external" href="http://redis.io/commands/sadd">SADD</a> for adding members to the current set.</li>
<li><a class="reference external" href="http://redis.io/commands/rename">RENAME</a> for copying the current set to the old, as well as destroying the
current set all in one step.</li>
<li><a class="reference external" href="http://redis.io/commands/sunion">SUNION</a> for performing a union on the current and old sets to produce the set
of who's online.</li>
</ul>
<p>And that's it! With these three primitives we have everything we need. This is
because of the following useful Redis behaviors:</p>
<ul class="simple">
<li>Performing a <tt class="docutils literal">SADD</tt> against a set that doesn't exist creates the set and is
not an error.</li>
<li>Performing a <tt class="docutils literal">SUNION</tt> with sets that don't exist is fine; they are simply
treated as empty sets.</li>
</ul>
<p>The one caveat involves the <tt class="docutils literal">RENAME</tt> command. If the key you wish to rename
does not exist, Redis returns an error, which the Python Redis client raises as
an exception.</p>
<p>Experimenting with algorithms and ideas is quite easy with Redis. You can either
use the Python Redis client in a Python interactive interpreter shell, or you can
use the command-line client that comes with Redis. Either way you can quickly
try out commands and refine your approach.</p>
</div>
<div class="section" id="implementation">
<h3>Implementation</h3>
<p>My website is powered by <a class="reference external" href="http://djangoproject.com">Django</a>, but I am not going to show any Django specific
code here. Instead I'll show just the pure Python parts, and hopefully you can
adapt it to whatever framework, if any, you are using.</p>
<p>I created a Python module to hold this functionality:
<tt class="docutils literal">whos_online.py</tt>. Throughout this module I use a lot of exception handling,
mainly because if the Redis server has crashed (or if I forgot to start it, say
in development) I don't want my website to be unusable. If Redis is unavailable,
I simply log an error and drive on. Note that in my limited experience Redis is
very stable and has not crashed on me once, but it is good to be defensive.</p>
<p>The first important function used throughout this module is a function to obtain
a connection to the Redis server:</p>
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">logging</span>
<span class="kn">import</span> <span class="nn">redis</span>

<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="n">__name__</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">_get_connection</span><span class="p">():</span>
    <span class="sd">&quot;&quot;&quot;</span>
<span class="sd">    Create and return a Redis connection. Returns None on failure.</span>
<span class="sd">    &quot;&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">conn</span> <span class="o">=</span> <span class="n">redis</span><span class="o">.</span><span class="n">Redis</span><span class="p">(</span><span class="n">host</span><span class="o">=</span><span class="n">HOST</span><span class="p">,</span> <span class="n">port</span><span class="o">=</span><span class="n">PORT</span><span class="p">,</span> <span class="n">db</span><span class="o">=</span><span class="n">DB</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">conn</span>
    <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">RedisError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
        <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>

    <span class="k">return</span> <span class="bp">None</span>
</pre></div>
<p>The <tt class="docutils literal">HOST</tt>, <tt class="docutils literal">PORT</tt>, and <tt class="docutils literal">DB</tt> constants can come from a
configuration file or they could be module-level constants. In my case they are set in my
Django <tt class="docutils literal">settings.py</tt> file. Once we have this connection object, we are free to
use the Redis API exposed via the Python Redis client.</p>
<p>To update the current set whenever we see a user, I call this function:</p>
<div class="highlight"><pre><span class="c"># Redis key names:</span>
<span class="n">USER_CURRENT_KEY</span> <span class="o">=</span> <span class="s">&quot;wo_user_current&quot;</span>
<span class="n">USER_OLD_KEY</span> <span class="o">=</span> <span class="s">&quot;wo_user_old&quot;</span>

<span class="k">def</span> <span class="nf">report_user</span><span class="p">(</span><span class="n">username</span><span class="p">):</span>
 <span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> Call this function when a user has been seen. The username will be added to</span>
<span class="sd"> the current set.</span>
<span class="sd"> &quot;&quot;&quot;</span>
 <span class="n">conn</span> <span class="o">=</span> <span class="n">_get_connection</span><span class="p">()</span>
 <span class="k">if</span> <span class="n">conn</span><span class="p">:</span>
     <span class="k">try</span><span class="p">:</span>
         <span class="n">conn</span><span class="o">.</span><span class="n">sadd</span><span class="p">(</span><span class="n">USER_CURRENT_KEY</span><span class="p">,</span> <span class="n">username</span><span class="p">)</span>
     <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">RedisError</span><span class="p">,</span> <span class="n">e</span><span class="p">:</span>
         <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</pre></div>
<p>If you are using Django, a good spot to call this function is from a piece
of <a class="reference external" href="http://docs.djangoproject.com/en/1.3/topics/http/middleware/">custom middleware</a>. I kept my &quot;5 minute cookie&quot; algorithm to avoid doing this on
every request although it is probably unnecessary on my low traffic site.</p>
<p>Periodically you need to &quot;age out&quot; the sets by destroying the old set, moving
the current set to the old set, and then emptying the current set.</p>
<div class="highlight"><pre><span class="k">def</span> <span class="nf">tick</span><span class="p">():</span>
    <span class="sd">&quot;&quot;&quot;</span>
<span class="sd">    Call this function to &quot;age out&quot; the old set by renaming the current set</span>
<span class="sd">    to the old.</span>
<span class="sd">    &quot;&quot;&quot;</span>
    <span class="n">conn</span> <span class="o">=</span> <span class="n">_get_connection</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">conn</span><span class="p">:</span>
        <span class="c"># An exception may be raised if the current key doesn&#39;t exist; if that</span>
        <span class="c"># happens we have to delete the old set because no one is online.</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">conn</span><span class="o">.</span><span class="n">rename</span><span class="p">(</span><span class="n">USER_CURRENT_KEY</span><span class="p">,</span> <span class="n">USER_OLD_KEY</span><span class="p">)</span>
        <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">ResponseError</span><span class="p">:</span>
            <span class="k">try</span><span class="p">:</span>
                <span class="k">del</span> <span class="n">conn</span><span class="p">[</span><span class="n">USER_OLD_KEY</span><span class="p">]</span>
            <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">RedisError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
                <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
        <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">RedisError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
</pre></div>
<p>As mentioned previously, if no one is on your site, eventually your current set
will cease to exist as it is renamed and not populated further. If you attempt to
rename a non-existent key, the Python Redis client raises a <tt class="docutils literal">ResponseError</tt> exception.
If this occurs we just manually delete the old set. In a bit of Pythonic cleverness,
the Python Redis client supports the <tt class="docutils literal">del</tt> syntax to support this operation.</p>
<p>The <tt class="docutils literal">tick()</tt> function can be called periodically by a cron job, for example. If you are using Django,
you could create a <a class="reference external" href="http://docs.djangoproject.com/en/1.3/howto/custom-management-commands/">custom management command</a> that calls <tt class="docutils literal">tick()</tt> and schedule cron
to execute it. Alternatively, you could use something like <a class="reference external" href="http://celeryproject.org/">Celery</a> to schedule a
job to do the same. (As an aside, Redis can be used as a back-end for Celery, something that I hope
to explore in the near future.)</p>
<p>Finally, you need a way to obtain the current &quot;who's online&quot; set, which again is
a union of the current and old sets.</p>
<div class="highlight"><pre><span class="k">def</span> <span class="nf">get_users_online</span><span class="p">():</span>
    <span class="sd">&quot;&quot;&quot;</span>
<span class="sd">    Returns a set of user names which is the union of the current and old</span>
<span class="sd">    sets.</span>
<span class="sd">    &quot;&quot;&quot;</span>
    <span class="n">conn</span> <span class="o">=</span> <span class="n">_get_connection</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">conn</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="c"># Note that keys that do not exist are considered empty sets</span>
            <span class="k">return</span> <span class="n">conn</span><span class="o">.</span><span class="n">sunion</span><span class="p">([</span><span class="n">USER_CURRENT_KEY</span><span class="p">,</span> <span class="n">USER_OLD_KEY</span><span class="p">])</span>
        <span class="k">except</span> <span class="n">redis</span><span class="o">.</span><span class="n">RedisError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>

    <span class="k">return</span> <span class="nb">set</span><span class="p">()</span>
</pre></div>
<p>In my Django application, I call this function from a <a class="reference external" href="http://docs.djangoproject.com/en/1.3/howto/custom-template-tags/#inclusion-tags">custom inclusion template tag</a>.</p>
</div>
<div class="section" id="conclusion">
<h3>Conclusion</h3>
<p>I hope this blog post gives you some idea of the usefulness of Redis. I expanded
on this example to also keep track of non-authenticated &quot;guest&quot; users. I simply added
another pair of sets to track IP addresses.</p>
<p>If you are like me, you are probably already thinking about shifting some functions that you
awkwardly jammed onto a traditional database to Redis and other &quot;NoSQL&quot;
technologies.</p>
</div>
</div>
]]></content:encoded>
    </item>
  </channel>
</rss>

