Finding the Oddity in a Sequence

After reading Ned Batchelder’s Iter-tools for puzzles: oddity post yesterday, my first thought was to use itertools.groupby(). That version is the fastest I could come up with (by quite a bit actually, especially for longer sequences), but it requires sorting the iterable first, which uses additional space and won’t work with infinite sequences.
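A minimal sketch of the sort-then-group idea, for comparison with the streaming version further down. The name oddity_groupby and the use of ValueError are illustrative assumptions, not the code that was benchmarked. It also assumes the keys can be ordered, since sorted() needs that.

```python
from itertools import groupby


def oddity_groupby(iterable, key=None):
    """Return (common_key, odd_item); odd_item is None if all keys match.

    Hypothetical helper for illustration only.
    """
    # Sorting materializes the whole iterable: O(n) extra space, and it
    # never finishes on an infinite sequence.
    items = sorted(iterable, key=key)
    groups = [(k, list(g)) for k, g in groupby(items, key=key)]

    if not groups:
        raise ValueError("empty iterable")
    if len(groups) > 2:
        raise ValueError("more than two distinct values")

    if len(groups) == 1:
        only_key, members = groups[0]
        if len(members) < 2:
            raise ValueError("no common value")
        return only_key, None

    # Put the smaller group first; it should hold the single odd item.
    (_, odd), (common_key, common) = sorted(groups, key=lambda kg: len(kg[1]))
    if len(odd) != 1 or len(common) < 2:
        raise ValueError("expected one odd value and a repeated common value")
    return common_key, odd[0]
```

For example, oddity_groupby([1, 1, 2, 1]) returns (1, 2).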

My next thought was to use a set to keep track of seen elements, but that requires the keys to be hashable, so I scrapped that idea.

I figured using a list to keep track of seen elements wouldn’t be too bad if the seen list was never allowed to grow beyond two elements. After playing around with this version for a while, I finally came up with something on par with Ned’s version performance-wise while meeting the following objectives:

  • Don’t require keys to be hashable
  • Don’t store the elements read from the iterable
  • Support infinite sequences

Some interesting things:

  • Rearranging the branches to use in instead of not in sped things up more than I would have thought
  • Initially, I was checking the lengths of the seen and common lists on every iteration, which slowed things down noticeably (which isn’t all that surprising really)
  • Setting the key function to the identity function lambda v: v is noticeably slower than checking to see if it’s set on every iteration (also not surprising)
  • Using sets/dicts for seen, common, and uncommon didn’t make a noticeable difference (I tried adding a hashable option and setting things up accordingly, but it wasn’t worth the additional complexity)
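The key-function observation above can be checked with a rough timeit comparison along these lines. The two helper functions are hypothetical stand-ins for the loop body, not the actual benchmark from the gist, and the timings will vary by machine and Python version.

```python
import timeit


def with_check(items, key=None):
    # Test whether a key function was given on every iteration.
    return [key(item) if key else item for item in items]


def with_identity(items, key=None):
    # Always call a key function, defaulting to the identity function.
    if key is None:
        key = lambda v: v
    return [key(item) for item in items]


if __name__ == "__main__":
    data = list(range(100_000))
    for func in (with_check, with_identity):
        seconds = timeit.timeit(lambda: func(data), number=10)
        print(f"{func.__name__}: {seconds:.3f}s")
```

The identity version pays for a function call per item, which is the likely source of the difference.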

Here’s the code (the gist includes a docstring, tests, and a simplistic benchmark):

def oddity(iterable, key=None, first=False):
    seen = []
    common = []
    uncommon = []

    for item in iter(iterable):
        item_key = key(item) if key else item

        if item_key in seen:
            if item_key in common:
                if first and uncommon:
                    return item_key, uncommon[1]
            else:
                common.append(item_key)
                if len(common) == 2:
                    raise TooManyCommonValuesError

                if item_key in uncommon:
                    i = uncommon.index(item_key)
                    j = i + 2
                    uncommon[i:j] = []
        else:
            seen.append(item_key)
            if len(seen) == 3:
                raise TooManyDistinctValuesError
            uncommon.extend((item_key, item))

    if len(seen) == 0:
        raise EmptyError

    if len(common) == 0:
        raise NoCommonValueError

    if len(uncommon) == 0:
        uncommon_value = None
    else:
        uncommon_value = uncommon[1]

    return common[0], uncommon_value
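The listing raises four exceptions it doesn’t define. A minimal sketch of what they might look like follows; the class names come from the code above, but the shared OddityError base class and the docstrings are assumptions, and the gist may define them differently.

```python
class OddityError(Exception):
    """Hypothetical base class so callers can catch every failure at once."""


class EmptyError(OddityError):
    """The iterable produced no items."""


class NoCommonValueError(OddityError):
    """No key was seen more than once."""


class TooManyCommonValuesError(OddityError):
    """A second key was seen more than once."""


class TooManyDistinctValuesError(OddityError):
    """A third distinct key was seen."""
```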

Release Early, Release Often

It’s an old saw, but I was wondering today why some projects don’t cut releases more often. The repo for a project may contain the bug fix you need, but it’s just sitting there on GitHub. I think it often just comes down to the fact that making releases is tedious.

You have to update the version number (perhaps in multiple places), update the change log (hopefully), merge your development branch into your release/master branch, create a tag, clean your dev environment, build a distributable package, upload that package, maybe upload some docs, push some commits, etc.

Doing all that manually isn’t much fun, so…

Write a script to do it for you.

Write it in Python or Bash or as a make target or whatever floats your boat. It’s a one-time cost that pays off big.

You can’t quite automate everything–like writing a (good) change log–but you can automate most of the process.
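As a rough idea of what such a script can look like, here is a minimal sketch, not the release script mentioned below. It assumes the version lives in a single __version__ = "..." assignment, that releases are tagged as v<version>, and that the PyPA build package is installed; the function names are made up for illustration, and the upload step is left out.

```python
import re
import shlex
import subprocess
import sys

# Assumption: the version is stored as  __version__ = "1.0a1"  in one place.
VERSION_RE = re.compile(r"""^(__version__\s*=\s*)(['"])(.*?)\2""", re.MULTILINE)


def bump_version(source, new_version):
    """Return source text with its __version__ assignment set to new_version."""
    updated, count = VERSION_RE.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{new_version}{m.group(2)}", source
    )
    if count != 1:
        raise ValueError("expected exactly one __version__ assignment")
    return updated


def release_commands(version, branch="master"):
    """List the commands for a release, in order (upload step omitted)."""
    return [
        ["git", "commit", "-am", f"Prepare release {version}"],
        ["git", "tag", "-a", f"v{version}", "-m", f"Release {version}"],
        [sys.executable, "-m", "build"],  # needs the 'build' package
        ["git", "push", "origin", branch, "--tags"],
    ]


def run_release(version, dry_run=True):
    """Print each command; only execute them when dry_run is False."""
    for command in release_commands(version):
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_release("1.0a2", dry_run=True)
```

Defaulting to a dry run makes it safe to check the plan before anything gets tagged or pushed.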

As an example, I wrote this release script for a project I started a month and a half ago. I’ve already made 26 alpha releases because it’s so easy to do. Putting in an hour or two up front was well worth it.

If you’re feeling lazy, you can use something like zest.releaser (for Python projects). I’ve used it in the past and it’s been the inspiration for all the release scripts I’ve written since.

What I Hate About Python

During an interview a while back, I was asked to name some things I hate about Python. For some reason, I choked and couldn’t think of a good answer (I kind of wanted to blame the interview process, but that’s a rant for another time).

Maybe I’ve just been programming in Python for too long, and that’s why I couldn’t think of something (or maybe I’m just a massive Python fanboy). On the other hand, I’ve been programming in JavaScript (which I generally like) for about as long, and I can think of at least a few things right off the top of my head (mostly related to weak typing).

I did a search to see what other people don’t like about Python to get some inspiration, but I didn’t come across anything I truly hate.

Things That Don’t Bother Me

  • Significant whitespace. I love it.
  • Explicit self. I guess it would be “convenient” if I didn’t have to add self to every method signature, but I really don’t spend much time on that, and it takes about 1ns to type (in fact, my IDE fills it in for me). There are technical and stylistic considerations here, but the upshot for me is that it just doesn’t matter, and I actually like that all instance attribute access requires the self. prefix.
  • “Crippled” lambda. There are rare occasions where I want to define more complex anonymous functions, but there’s no loss of expressiveness from having to use a “regular” named function instead. Maybe multi-line anonymous functions that allow statements would lead to different/better ways of thinking about programs, but I’m not particularly convinced of that. (Aside: one thing I do hate relating to this is the conflation of lambdas and closures–normal functions are closures in the same way that lambdas are.)
  • Packaging. I don’t know why, but I’ve never had any problems with setuptools. There are some issues with the installation of eggs when using easy_install, but I think pip fixes them. I am glad that setuptools is now being actively developed again and the distribute fork is no longer necessary.
  • Performance, GIL, etc. I’ve used Python for some pretty serious data crunching (hello, multiprocessing) as well as for Web stuff. There are cases where something else might have been faster, but Python has almost never been too slow (caveat: for my use cases). Of course, Python isn’t suitable for some things, but for most of the things I need to do, it’s plenty fast enough.
  • len(), et al. I don’t have anything to say about this other than it’s a complete non-issue for me. Commentary about how this means Python isn’t purely object-oriented makes me a little cranky.
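To illustrate the aside about lambdas and closures: in this small sketch (the names are made up for the example), a named inner function and a lambda both capture the same enclosing variable in the same way.

```python
def make_adders(n):
    # A named inner function that closes over n.
    def named(x):
        return x + n

    # A lambda that closes over the same n.
    anonymous = lambda x: x + n

    return named, anonymous


named, anonymous = make_adders(10)
print(named(1), anonymous(1))  # 11 11
# Both carry a closure cell holding n.
print(named.__closure__[0].cell_contents)      # 10
print(anonymous.__closure__[0].cell_contents)  # 10
```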

Things That Bug Me a Little Bit

  • The way super works in Python 2 is kind of annoying (being required to pass the class and self in the super() call). This is fixed in Python 3, where you can just say super().method() in the common case.
  • Unicode vs bytes in Python 2. This is also fixed in Python 3. Some people have argued that it’s not, but I haven’t run into any issues with it yet (maybe because I’m working on a Python 3 only project?).
  • The implicit namespace package support added in Python 3.3 causes some trouble for my IDE (PyCharm), but I’m assuming this is a temporary problem. I also had some trouble using nose and py.test with namespace packages. Again, I assume (hope) this is only temporary.
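For the super() point above, a small sketch of the two spellings. The class names are invented for the example, and both forms run on Python 3; on Python 2 only the explicit form works, and only with new-style classes.

```python
class Base:
    def greet(self):
        return "hello from Base"


class ExplicitStyle(Base):
    def greet(self):
        # Python 2 spelling: the class and self must be passed explicitly.
        return super(ExplicitStyle, self).greet() + " (explicit)"


class ZeroArgStyle(Base):
    def greet(self):
        # Python 3 spelling: the class and instance are filled in for you.
        return super().greet() + " (zero-argument)"


print(ExplicitStyle().greet())  # hello from Base (explicit)
print(ZeroArgStyle().greet())   # hello from Base (zero-argument)
```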

Things That Bother Me a Little More

  • The Python 2/3 gap is a bit troublesome. Sometimes I think the perception that there’s a problem may be more of a problem, but I don’t maintain any major open source projects, so I’m not qualified to say much about this. Personally, I’ve really been enjoying Python 3, and I do think it offers some worthwhile advantages over Python 2.

Conclusion

There isn’t one. I left some things out intentionally (various quirks). I probably forgot some things too.

YUI TreeView with Ruby on Rails

Here’s some code I’m using to generate a dynamic tree view using an acts_as_tree model with slug and title fields, the TreeView widget from YUI, and a Rails helper. I chopped out some of the code for clarity, so all this does is create a menu with the titles from the model, but the basic idea is there to expand on.

Rails view/JavaScript

<script type="text/javascript">
  var page_tree;
  page_tree_init = function () {
    page_tree = new YAHOO.widget.TreeView('page_tree');
    var root = page_tree.getRoot();
    <% generate_page_nodes(@root) {} %>
    page_tree.draw();
  };
  page_tree_init();
</script>

Ruby

def generate_page_nodes(node, &block)
  parent = node.parent
  node_name = node.slug.gsub('-', '_')
  parent_node_name = parent.nil? ? 'root' : parent.slug.gsub('-', '_')
  js = <<-JS
    var #{node_name} = new YAHOO.widget.MenuNode('<span class="node_title">#{node.title}</span>',
                                                 #{parent_node_name});
  JS
  concat(js, block.binding)
  children = node.children
  children.each { |c| generate_page_nodes(c, &block) } unless children.empty?
end

Ruby on Rails… Revisited

Updated with links and a couple typo corrections.

Update: It wasn’t long before the project got too complex on the back end (SOAP blech) for my limited Ruby knowledge. I switched it back to Python/Pylons and never looked back. The Pylons => Rails migration was straightforward. I guess I could have pushed through with Ruby/Rails, but with deadlines looming, it made more sense for me to go with what I knew best. Being familiar with Python and its ecosystem was far more pertinent than the deficiency of any particular library. There’s probably another blog post or two in here…

I’ve been working on a fairly big Web site project lately. My partner and I initially decided to use Django to build the site, mainly because I’m a Python “expert” and Django is (apparently) the #1 Python Web framework. We were also lured by the easy admin interface.

After trying to use Django and not really enjoying it, I tried switching to Pylons because I’ve had a good amount of experience with it in the building of byCycle.org. It’s gone through two fairly major releases since then, and so have a bunch of the libraries that tend to get used with it, like SQLAlchemy, Elixir, etc.

I was having a hard time with the Pylons docs, and so I ended up screwing around with Grok (which actually looks fairly interesting) and even took a look at the Zope 3 site. I’m sure Zope is really awesome or whatever, but it might as well suck. Every time I look at that site, I’m just like “WTF! This shit has been around for like five years!” Anyway, I might just not be smart enough for Zope.

This led us back toward Rails (even if it is a ghetto). I used Rails a bit last year but never did anything too serious with it. Diving into it today was quite a pleasure. There are issues to be sure, but overall I’m enjoying it far more than any of the other options we had tried. I’m also enjoying learning/relearning Ruby.

If Pylons had good docs, we’d probably be using that.

So, I don’t know if this is a particularly useful post, since I didn’t get into much in the way of reasons (what, I have to back this up?!). This subject’s been hashed and rehashed, but I just wanted (needed) to make a qualitative statement about my/our experience, which, of course, is purely personal.