Brigham Young University
Computer Science

Jonathan Ellis ('99)

Blog Day recommendations

Mon, 09/01/2008 - 3:14am

As with many things in the blogging echo chamber, blog day takes itself a little too seriously.  But it's impossible not to love an excuse to talk about some of my favorite blogs that don't seem to have as much exposure as I think they deserve:

  1. Theo Schlossnagle, CEO of OmniTI, a scalability and performance consulting company.  His best posts deal with scalability at the ops level.  His book is good, too.
  2. Greg Linden, ex-Amazon engineer, ex-Findory founder, current MS Live Labs employee.  He likes to post analyses of interesting CS talks and papers, particularly in the area of collective intelligence.  Greg stays very on-topic so the most recent posts are about as representative as any.
  3. Chris Siebenmann writes about life as a professional sysadmin.  He also sometimes blogs about Python.
  4. Josh Berkus, PostgreSQL core team member, mostly blogs about current events in the database world, but every once in a while he writes a must-read post about database design.  Google thinks that "Rules for Database Contracting" is his most popular post, and that's a good pick too.
  5. A non-technical pick: Eric Burns is the Gene Siskel of web comic critique.  Except he's not dead.

App Engine conclusions

Fri, 08/29/2008 - 3:16pm

Having been eyeball deep in App Engine for a while, I've reluctantly concluded that I don't like it.  I want to like it, since it's a great poster child for Python.  And there are some bright spots, like the dirt-simple integration with Google accounts.  But it's so very primitive in so many ways.  It's not just the missing features, or the "you can use any web framework you like, as long as it's Django" attitude; a lot of the API that does exist is primitive too.

The DataStore in particular feels like a giant step backwards from using a traditional database with a sophisticated ORM.  Sure, it can scale if you use it right, but do you really know what that entails?

Take the example of simple counting of objects.  There's a count() method, but in practice, it's so slow you can't use it.  Denormalize with a .count property?  Yeah, that doesn't scale either: what you really need is a separate, sharded Counter class.  And yes, sharding is very, very manual.  (See slides 18-23 in the link there, and the associated video starting about 19:00.)  
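To make the "separate, sharded Counter" idea concrete, here is a minimal sketch in plain Python.  It does not use the App Engine API at all: the dict entries stand in for datastore entities, and the class name, shard count, and key format are all assumptions made up for illustration.  The point is only the shape of the technique: spread writes across several shards, and sum the shards on read.

```python
import random


class ShardedCounter:
    """In-memory stand-in for a counter split across several datastore entities."""

    def __init__(self, name, num_shards=20):
        self.name = name
        # Each dict entry plays the role of one shard entity (hypothetical key format).
        self.shards = {f"{name}-{i}": 0 for i in range(num_shards)}

    def increment(self, amount=1):
        # Writes go to a randomly chosen shard, so concurrent writers
        # rarely contend for the same entity.
        key = random.choice(list(self.shards))
        self.shards[key] += amount

    def total(self):
        # Reads have to visit every shard and add the partial counts up.
        return sum(self.shards.values())


counter = ShardedCounter("todo_items", num_shards=5)
for _ in range(100):
    counter.increment()
```

Even in this toy form you can see the cost: one logical integer turns into a class, a shard count you have to pick by hand, and a read that touches every shard.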

You can't perform joins in GQL.  Or subselects.  Or call functions, aggregate or otherwise.  EVERYthing you are interested in needs to be pre-computed.  (Or computed by hand client-side, which is so slow it's barely an option at all.)  I can extrapolate from this to my experience in production schemas and it's not pretty.

Of course, you also lose any ability to write declarative, set-based code, which is demonstrably less error-prone than the imperative alternative.  Take a simple example from my demo app.  Marking a group of todo items finished is four statements:

# Fetch the selected items, then update and save each one individually.
items = TodoItem.get_by_id(
    [int(id) for id in request.POST.getlist('item_id')])
for item in items:
    item.finished = datetime.now()
    item.put()

Compare this with SQL:

# Assumes a driver such as psycopg2, which expands a tuple parameter into an IN list.
cursor.execute(
    "update todo_items set finished = CURRENT_TIMESTAMP where id in %s",
    (tuple(int(id) for id in request.POST.getlist('item_id')),))

Scalability is great, but taking a big hit to back-end productivity is too high a price for all but a few applications.  GAE is still young, so maybe Google will improve things, but their attitude so far seems to be "we know how to scale so shut up and do it the hard way."  I hope I am wrong.
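If you want to try the set-based update from the SQL comparison yourself, here is a self-contained sketch using the sqlite3 module from the standard library.  The table layout and the mark_finished helper are assumptions for the demo, not code from my app.  One wrinkle: sqlite3 will not expand a sequence into an IN list for you, so the sketch builds one "?" placeholder per id.

```python
import sqlite3


def mark_finished(conn, item_ids):
    """Mark every listed todo item finished with a single UPDATE statement."""
    ids = [int(i) for i in item_ids]
    if not ids:
        return 0
    # One "?" per id; the values themselves are still passed as parameters.
    placeholders = ", ".join("?" for _ in ids)
    cur = conn.execute(
        "UPDATE todo_items SET finished = CURRENT_TIMESTAMP "
        f"WHERE id IN ({placeholders})",
        ids,
    )
    conn.commit()
    return cur.rowcount


# Hypothetical schema and rows, just enough to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE todo_items (id INTEGER PRIMARY KEY, title TEXT, finished TEXT)"
)
conn.executemany(
    "INSERT INTO todo_items (id, title) VALUES (?, ?)",
    [(1, "write slides"), (2, "upload code"), (3, "test projector")],
)
updated = mark_finished(conn, ["1", "3"])
unfinished = conn.execute(
    "SELECT id FROM todo_items WHERE finished IS NULL"
).fetchall()
```

The database does the looping; the application code only says which rows and what to set.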

App Engine slides, code

Fri, 08/29/2008 - 12:11pm

My App Engine 101 slides and code are up now.

Bad news: my MacBook Pro did not work with the projector, period.

Good news: I have seen it do this before (in a room with several mac experts -- it was not user error) and brought a backup laptop.

Bad news: I forgot to include the Django beta1 framework in my code upload, so I told people to just download it.  But beta2 was out, and didn't work with the version of App Engine Helper I had.  (It looks like r58 fixes this.)  Manual poking about the Django download site ensued until I got a new zip uploaded.

Google App Engine at the Utah Open Source Conference

Mon, 08/25/2008 - 5:18pm

App Engine is probably the biggest thing to happen to Python this year, so of course I volunteered to give a presentation on it at the Utah Open Source Conference.  (I'm scheduled for Friday, Aug 29, at 10:00 AM.)  Last year's conference was a big success, so I'm looking forward to an even better experience this year.

Here's the abstract I submitted, before they blew away my paragraph breaks:

Google launched the App Engine service earlier this year to immense interest from the web development community. App Engine allows running applications on Google infrastructure, including BigTable, Google's non-relational, massively scalable database.

App Engine is appealing both at the low end, where small shops don't want to have to deal with hardware procurement and systems administration, and at the high end, where the kind of "instant scaling" App Engine promises to deal with bursty traffic is the holy grail of infrastructure planning. This tutorial will cover the basics of App Engine development, including development and deployment of a simple application.

Please sign up for an App Engine account and download the SDK ahead of time so we can jump right in to the code. Basic Python knowledge will be assumed.

After I submitted the proposal, I found out that all presentations are going to be 60 minutes long.  That is not much time if we're going to do hands-on work, but you retain so much more by doing than you do merely from watching that I don't consider it optional.  So seriously, come with the SDK installed.  Those who do not can look over the shoulders of those who do.

If you don't know Python and you're a last minute kind of person, you might want to attend Matt Harrison's talk the day before, 90% of the Python you need to know.  Matt has presented several times at the Utah Python User Group as well as PyCon.

Bonus tip: if you can't make it to the UTOSC, the two best talks on App Engine are Rapid Development with Python, Django, and Google App Engine and Building Scalable Web Applications with Google App Engine.  My presentation will cover similar material to the first of these.

A reminder

Fri, 08/15/2008 - 3:47pm

Now that I've been doing Python full time again for a while it's easy to forget how magical it can be.

Last night I got an IM from a friend of a friend asking for (a) a recommendation for a Python book and (b) advice on writing a screen scraper.  I pointed him to Dive Into Python and BeautifulSoup.  Just now he IMed me again, "Hey, thanks for the tip.  I ended up writing a screen scraper that I hadn't completed in 2 days in Groovy in about 20 minutes last night in Python with BeautifulSoup.  So thanks, you got another python convert."

I love my job.

SQLAlchemy-Migrate for dummies

Tue, 07/22/2008 - 5:06pm

I gave sqlalchemy-migrate a try today.  I like it, and I'm going to keep using it.  The one downside is that it's a bit hard to find "the least you need to know" in the documentation, especially if you lean old-school like me and prefer to write your upgrade scripts in raw SQL.  So here's my stab at it.

Create a "repository" for upgrade scripts:

migrate create path/to/upgradescripts "comment"

Create your manage script. If you have development/production dbs with different connection urls, create two scripts with the same repository but different urls:

migrate manage dbmanage.py --repository=path/to/upgradescripts --url=db-connection-url

For each database, create the Migrate metadata (a migrate_version table):

./dbmanage.py version_control

Create an upgrade script. This will create a script [next version number]-[database type]-upgrade.sql in the "versions" subdirectory of your "repository." That's all, so you could certainly do this by hand if you prefer, but letting the script do it is less error-prone:

./dbmanage.py script_sql sqlite

Edit the script.

For each database, apply the upgrade:

./dbmanage.py upgrade

Repeat the script/upgrade process as needed. That's it! Everything else is optional!

(What this gives you is a process where all your developers can have their own local database for development, and all they have to do is "svn up; ./dbmanage.py upgrade" without having to worry about which upgrade scripts have been applied or not.)
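If it helps to see why a version table makes "just run upgrade" safe, here is a rough sketch of the idea using only sqlite3.  This is not how sqlalchemy-migrate is implemented; the schema_version table, the UPGRADES dict, and both function names are hypothetical stand-ins for the tool's version table and the numbered scripts in the repository.

```python
import sqlite3

# Hypothetical upgrade scripts, keyed by the version they bring the schema to.
UPGRADES = {
    1: "CREATE TABLE todo_items (id INTEGER PRIMARY KEY, title TEXT)",
    2: "ALTER TABLE todo_items ADD COLUMN finished TEXT",
}


def version_control(conn):
    """Create the bookkeeping table and start the database at version 0."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    if conn.execute("SELECT COUNT(*) FROM schema_version").fetchone()[0] == 0:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
    conn.commit()


def upgrade(conn):
    """Apply, in order, every script newer than the recorded version."""
    current = conn.execute("SELECT version FROM schema_version").fetchone()[0]
    applied = []
    for version in sorted(v for v in UPGRADES if v > current):
        conn.execute(UPGRADES[version])
        conn.execute("UPDATE schema_version SET version = ?", (version,))
        conn.commit()
        applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
version_control(conn)
first_run = upgrade(conn)   # applies scripts 1 and 2
second_run = upgrade(conn)  # nothing left to apply
```

Because the database itself remembers which version it is at, running upgrade twice is harmless, which is exactly what you want when every developer has a local copy.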

How to tell when you're successful

Sun, 07/13/2008 - 10:53pm

You're successful when someone tries to get a cheap clone of your site done on a cheap-labor code monkey site.

I'm flattered, I think.  (Although I'd be more flattered if it were a good code monkey site.)

Brief review of the Matias Half Keyboard et al

Wed, 06/25/2008 - 1:12am

I ended up buying four pieces of equipment to help deal with being temporarily one-handed: the Matias half keyboard, the X-keys foot pedal (cheaper than the Kinesis pedals, which got lukewarm reviews on Amazon), the Keyspan PR-US2 Presentation Remote, and the Pacific Outdoors 17-LC100 Folding Recliner.

The good: I'm very pleased with the recliner and modestly happy with the remote.  I got the recliner to take naps in; the brace on my arm didn't really accommodate lying down.  This $80 recliner compares well with zero gravity recliners costing over 10x as much.  (I've used two of the expensive variety; a BackSaver and one whose brand I don't recall.)  The only downside is you can either sit up, or recline fully; there is supposedly a way to adjust the recline angle, but it doesn't really work.  Expensive zero gravity recliners can all reliably lock at any angle you like.

The remote mostly worked as a mouse substitute that I could use with my immobilized right hand, reducing the need to slow down my left hand even more by switching from keyboard to mouse and back.  Unfortunately, the mouse control pad is not nearly as good as one of the IBM "pointing sticks;" it appears to have four control points, like an old Nintendo D-pad, which gives only 8 possible directions to move in.  This and a poorly quantized pressure sensitivity sometimes made things frustrating.  If I were to do this again I would try a handheld trackball instead, even though I could not find any wireless models.

The bad: the half keyboard did not help programming speed with one hand, and the foot pedal didn't improve things.  I've returned both.

The half keyboard gives you the left hand side of the keyboard, which toggles to the right side when the space bar is held down.  So "a" becomes ";", "f" becomes "j", and so on.  For alphabetical keys, I found that it was true that I did not have to re-learn to touch type; I did not have to look at the keyboard, although I did have to pause and think, "does this one require the space toggle or not."  I got up to about 20 wpm before giving up, compared to 25 with one hand on a full keyboard.  I think I could have easily doubled that to 40+ wpm with enough practice to eliminate that pause and recognize "runs" of letters that can be typed without releasing the space, like "you," without thinking.  But that kind of investment wasn't worth it because of a serious flaw.
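For the curious, the mirroring can be sketched in a few lines of Python.  I am assuming a plain QWERTY layout mirrored row by row; only the "a" to ";" and "f" to "j" pairs come from the description above, and the bottom-row punctuation in particular is a guess.  The keystrokes function is a hypothetical helper that reports, for each character, which left-hand key to press and whether space must be held.

```python
# The three letter rows of a QWERTY keyboard, split down the middle.
LEFT_ROWS = ["qwert", "asdfg", "zxcvb"]
RIGHT_ROWS = ["yuiop", "hjkl;", "nm,./"]

# Mirror each left-hand key onto the right-hand key at the same distance
# from the center of the keyboard (assumed layout).
MIRROR = {}
for left, right in zip(LEFT_ROWS, RIGHT_ROWS):
    for left_key, right_key in zip(left, reversed(right)):
        MIRROR[left_key] = right_key

# Reverse lookup: which left-hand key produces a given right-hand character.
UNMIRROR = {right_key: left_key for left_key, right_key in MIRROR.items()}


def keystrokes(text):
    """Return (left_hand_key, space_held) pairs needed to type text."""
    strokes = []
    for ch in text:
        if ch in MIRROR:
            strokes.append((ch, False))
        elif ch in UNMIRROR:
            strokes.append((UNMIRROR[ch], True))
        else:
            raise ValueError(f"{ch!r} is not on the letter rows")
    return strokes


you = keystrokes("you")
```

Under this mapping, "you" comes out as three strokes that all have space held, which is the kind of "run" I mean.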

The half keyboard is really more like a "1/4 keyboard."  It only gives you the alphabetical keys and a couple punctuation marks.  No number keys with their !@#$ counterparts.  No F keys.  No arrow keys.  On a Mac, you can have cmd or control but not both.

To allow these keys to be typed, there is a "numeric toggle" key that switches to keypad mode, and two other modes that you access by hitting "shift shift" and "shift shift shift."  Almost any line of code you might want to type is going to run into this.  Typing [0] for instance is shift shift s numerictoggle b numerictoggle shift shift a.  Even the symbol-averse Java will need parentheses for method calls, and yes, parens require mode switching too.  (As do braces.  Shudder!)

So I lost in the non-alphabetical and modifier access much more than I could see myself gaining on the pure alphabetical side.  

Finally, the modifier keys were on the right hand side of the keyboard where they were very difficult to combine with shift.  I tried to ameliorate the modifier key problems with the X-Keys pedal, mapping the pedals to cmd/ctrl/option, but that didn't really work either.  (The included ikeys software wouldn't work at all.  At least ControllerMate worked in non-X applications, but since Wing is the only IDE that does locals completion well, using a non-X IDE temporarily was a non-starter.  Locals completion is nice with two hands, but absolutely essential with one.)  Note that this is more of an OS X issue than a problem with these pedals; apparently mapping pedals (X-keys or Kinesis) to modifier keys works fine on Windows.

So, the half-keyboard is not useful for programmers.  If it (a) were wireless and (b) had a non-skid backing -- it slid all over the place because the back side was just smooth plastic -- I could see it being useful for heavy smartphone users.  But it fails there too.  Good luck with this one, Matias.

Postscript: I considered trying the Frogpad as well as the half keyboard, but with users reporting that they got "up to 20 wpm after 2 weeks," it didn't sound worth the trouble.  So if I ever had to spend another three weeks one-handed I am not sure what is left to try.  Probably I would try to use ControllerMate (OS X) or xmodmap (Linux) to make a "half keyboard" in software that didn't suck so much, as suggested by one of the commenters in my first post.

One-handed typing?

Sat, 05/31/2008 - 8:19pm

I separated my right shoulder so that arm is going to be out of commission for a while.  (I am right-handed.)  I'm managing about 25 wpm with one hand, or about 1/4 my normal speed.  This is frustrating.  The Handkey Twiddler has been out of production for a while.  The BAT is not OS X compatible. Anyone tried the Half Qwerty keyboard?  Are there other good options for under, say, $300?  (I found several very niche products for significantly more.)

I do plan to try voice recognition for email and IM but I can't see that working very well for code.

Jython Notes

Mon, 05/19/2008 - 11:22pm
I've been getting back into the Jython codebase this last week. The last time I submitted a Jython patch was in the beginning of 2004, so it's been a while. Things have changed... Jython is finally requiring Java 5 for the next release, which means the usual improvements, but especially good use of annotations. Here are some notes from my puttering around (mostly dragging Jython's set module up to compatibility with CPython 2.5's):
  • Expect Eclipse to be slightly confused. (Lots of "errors.") This is normal. Use ant to build.
  • ant regrtest is handy. Run it before you start making changes so you know what's already broken in trunk. (At least between releases, Jython does not appear to be religious about "no tests shall fail." But as a new developer you should make "no additional tests should fail" your motto.)
  • Subjective impression: Jython re performance is a bit slow. Jython uses its own re implementation predating the Java regular expressions in JDK 1.4. But the JRuby guys reported that the JDK implementation doesn't perform very well, so Jython hasn't been in a hurry to switch. The JRuby solution was to port the oniguruma re engine from C to Java. But Ruby's strings are byte-based and mutable where Jython's are not, so using the JRuby engine isn't just a matter of dropping it in. Also, these string differences may be a source of the poor performance the Ruby people saw, so independent testing is in order here.
  • All of the Derived classes (PySetDerived, PyLongDerived, etc.) just exist to let Python code subclass builtin types. Those derived classes are generated by a .py script in src/templates.
  • If you add a Java class that needs to be exposed to Python using the @Expose annotations, you need to add the class name to CoreExposed.includes, or Jython will default to picking attributes via reflection and it usually guesses wrong.
  • Given a PyObject, you can (usually) easily instantiate another PyObject of the same class with pyobject.getType().__call__(). The only times this won't work is when your type's __new__ does something tricky, like how PyFrozenSet or PyTuple return a singleton for an empty frozenset or tuple.
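The last point has a direct analogue at the Python level, which may be easier to play with than the Java side. Here is a minimal sketch; EmptyCachingBag and new_like are hypothetical names invented for the example, not anything in Jython. The class's __new__ hands back one shared object for the empty case, so "instantiate another object of the same class" quietly returns the object you already had.

```python
class EmptyCachingBag(tuple):
    """Hypothetical type whose __new__ returns one shared object when empty."""

    _empty = None

    def __new__(cls, items=()):
        items = tuple(items)
        if not items:
            # The tricky part: every empty instance is the same cached object.
            if cls._empty is None:
                cls._empty = super().__new__(cls, ())
            return cls._empty
        return super().__new__(cls, items)


def new_like(obj, *args):
    """Python-level analogue of pyobject.getType().__call__()."""
    return type(obj)(*args)


empty = EmptyCachingBag()
also_empty = new_like(empty)        # same object as `empty`, not a fresh one
full = EmptyCachingBag([1, 2])
other_full = new_like(full, [1, 2])  # a genuinely new object
```

For an immutable type, sharing the empty instance is harmless; the trap is code that assumes the result is always a fresh object.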
Thanks to all the people in #jython who helped me out, especially Philip Jenvey!

