Bike data analysis talk

July 20th, 2014

I gave a talk at Spark Summit earlier this month about my work using Apache Spark to analyze my bike power meter data, and the conference videos are now online. You can watch my talk here:

If you’re interested in seeing one of these analyses in action, I’ve also made a short video demo:

Making sense of bicycling data

April 2nd, 2014

2014 04 02 at 8 46 AM

I’ve been working with Apache Spark a lot lately and recently wrote some code to analyze and visualize bicycling telemetry data with Spark. I’ve posted a more detailed writeup (including an explanation of what the above picture means) over at my work blog.

  • In an ideal world, the existence of this aggressively stupid law would open more eyes to how many avenues for tyranny were opened by the 1998 passage of the Digital Millennium Copyright Act. But in our world, people will probably just hope that the law won’t be enforced. (Derek Khanna, who wrote the linked piece and is currently famous for proposing sensible copyright-law reforms and then getting fired, has been absolutely crushing the technology-policy front lately.)


Happily ever after

November 17th, 2012

In December 2011, Apple filed for a patent on page-turning animations in software. This patent was granted last week. Now there’s nothing left to keep Apple from suing MicroIllusions into oblivion:

(I’m still trying to discover whether or not Apple has also patented a system and method for ensuring that a user has the map from the original packaging before allowing him to play the game.)

  • Avid’s death spiral continues and has claimed the development team for Sibelius. This is terrible news. I was very good with Finale in college but switched to Sibelius for my (presently very limited) music-typesetting needs after I got my first Mac in 2002. The notation software world, which is not a likely candidate for disruption, will absolutely suffer without spirited competition.


The end of passwords?

August 1st, 2012

My pal Ben Brown (who has known me so long that he remembers a time when I could vote without antiemetics) has an interesting proposal to manage login credentials; Ben begins by describing a pattern that I’ve absolutely used for some infrequent-login sites:

My personal solution to the too-many-password problem is to use completely random, automatically generated password when I create an account. Most websites will allow me to stay logged in forever, and on the odd occasion that I need to log in again, a password reset tool will send a link to your email account that will allow me to login again. This way, I don’t really have a password, but I can always gain access to any account, as long as I still have access to my secure email account.

His solution is to eliminate passwords altogether and email a unique, expiring login link to users when they wish to log in. Read the whole piece (and his followup) for the argument, which I find convincing.

In fact, I used a variant of this approach for SVP, a service I developed because I hate Evite and wanted to invite people to my birthday party in 2007. (After all, most of my friends are too popular and sophisticated to be particularly happy about managing credentials for a one-off site that some curmudgeon made to avoid using the ubiquitous alternative.) When I’d invite people to events, they’d get an email with a link that would log them in to RSVP for that event. Users could set passwords, but the site interaction model was designed to never require them. It was pretty successful on a (very) small scale: I had around 50 invitees/users and probably ten events before I stopped using the service, but everyone who wanted to come over seemed to be able to reply and no one complained about it to my face.
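As a rough illustration of the expiring-link idea (not how SVP or Ben’s proposal was actually implemented), here is a shell sketch. The flat-file store `TOKEN_DB`, the 15-minute `TOKEN_TTL`, and the function names `issue_token` and `redeem_token` are all assumptions made for the example; a real site would keep tokens in its database and mail the token inside a login URL.

```shell
#!/bin/sh
# Hypothetical flat-file token store; a real service would use its database.
TOKEN_DB="${TOKEN_DB:-/tmp/login_tokens.txt}"
# Seconds a link stays valid (assumed here: 15 minutes).
TOKEN_TTL="${TOKEN_TTL:-900}"

# issue_token EMAIL: record a random token for EMAIL and print the token.
issue_token() {
    token=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    expires=$(( $(date +%s) + TOKEN_TTL ))
    printf '%s %s %s\n' "$token" "$expires" "$1" >> "$TOKEN_DB"
    printf '%s\n' "$token"
}

# redeem_token TOKEN: print the owner's email if TOKEN is known and has not
# expired, then delete it so the same link cannot be used twice.
redeem_token() {
    now=$(date +%s)
    email=$(awk -v t="$1" -v now="$now" \
        '($1 "") == (t "") && $2 + 0 > now + 0 { print $3; exit }' \
        "$TOKEN_DB" 2>/dev/null)
    [ -n "$email" ] || return 1
    awk -v t="$1" '($1 "") != (t "")' "$TOKEN_DB" > "$TOKEN_DB.tmp" &&
        mv "$TOKEN_DB.tmp" "$TOKEN_DB"
    printf '%s\n' "$email"
}
```

The site would mail the output of `issue_token` as part of a URL and call `redeem_token` when the link is followed. Deleting the token on first use is a design choice of this sketch; a link that stays valid until it expires would simply skip that step.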

Visualizing last summer’s road cycling

January 3rd, 2012

I recently plotted some of my local road bike rides from last season using TileMill, the TIGER map data from the US Census Bureau, and exported activity routes from RunKeeper. I had a pretty good sense of what ride I do the most often (a fairly flat 18-20 mile out-and-back that I can complete in under an hour unless car traffic is awful — great for time-constrained rides and intervals), but I was interested to see where else I’ve gone. The results turned out pretty well, so I’m posting them here. I only plotted road rides in Dane County, only rides on a geared bike (i.e., no commutes), and I chose only a subset of all my rides. The paths are lighter or darker proportionally to how frequently I traveled them.

Arboretum detail

The figure above is a close-up of the section of the map including the UW Arboretum and the Capital City Trail. This was by far my most common ride in 2010, but I did it much less frequently in 2011. (In fact, I think I rode this route more frequently on my fixed-gear than on my road bike in 2011.)

B11 paoli

This is a detail of some of my favorite short hill-climbing loops near Paoli, WI. The loop to the lower left is more challenging (and more rewarding) but I didn’t do it as often. To the upper left is the beginning of a fast and fun route to Mt. Horeb, WI that also serves as the beginning of the WI Ironman cycling loop. I’m hoping both of these will be substantially darker at the end of next summer!

Biking sm

Finally, here’s the whole map, cropped to include Madison (for context) and the parts of western Dane County that I actually rode in.

I did these manually, so the obvious next step is to write up a little program to generate these automatically. I’d also like to have a more interesting visualization (like making paths thicker instead of darker or perhaps incorporating elevation and average speed data somehow). Overall, though, I’m pleased with these results. I was quite impressed with how easy TileMill was to use, and am optimistic that this toolchain, combined with some additional cleverness and care, could produce a really compelling presentation of these data.
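One possible starting point for that little program, sketched in shell. It assumes each ride has already been converted to a plain text file with one `lat,lon` pair per line and no header (an exported route would need a conversion step first); the function names and the three-decimal grid are illustrative choices, not part of the workflow described above.

```shell
#!/bin/sh
# ride_frequency FILE...: print "count lat,lon" for every grid cell,
# most-ridden cell first. Each cell is counted at most once per ride.
ride_frequency() {
    for ride in "$@"; do
        # Snap points to a three-decimal grid (on the order of 100 m)
        # and drop repeats within the same ride.
        awk -F, '$1 ~ /^-?[0-9]/ && NF >= 2 { printf "%.3f,%.3f\n", $1, $2 }' "$ride" |
            sort -u
    done | sort | uniq -c | sort -k1,1nr -k2,2
}

# to_opacity: read ride_frequency output and print "lat,lon opacity",
# scaling each count by the largest count (which arrives on the first line).
to_opacity() {
    awk 'NR == 1 { max = $1 } { printf "%s %.2f\n", $2, $1 / max }'
}
```

Running `ride_frequency rides/*.txt | to_opacity` would give one opacity value per cell, which could then drive line darkness (or thickness) in the map style.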

  • A nice explanation of the difference between open source and so-called “Open Core” licensing: “What is most important to understand about an Open Core project is that it has nothing to do with an open source project. If you are depending on a single closed source component then you have to regard the whole project as a closed source project as you lose all the benefits of open source.”


Misinterpretations of user-interface freedom

March 29th, 2011

This gallery of hideous screenshots of mobile devices might make one nostalgic for 1997-vintage galleries of “WinAmp skins” or “Enlightenment themes.”1 In fact, it’s as if some of those old screenshots had been placed in suspended animation and reconstituted as 480×800 JPEGs with wireless carrier information: the anachronistic speculative-fiction movie references (“my phone is, like, in The Matrix, dude”), the Star Trek-inspired UI chrome, and the gratuitous misogyny all recall an era when the only thing nerds demanded from a computing platform was unfettered freedom to make their devices look as ludicrous as possible.

(Link via DF; see also more on the “misinterpretations of freedom.”)

1 Amazingly, both of those links are current: not only can you still download both WinAmp and Enlightenment, but people are still apparently furiously “theming” both.

Durability and performance

June 23rd, 2010

Wolf Rentzsch shares an amusing anecdote about “high-performance” database systems whose developers claim to achieve high performance by eschewing sync calls. Recall that the sync system call is the one that ensures that the bits that you’ve just written to disk actually made it to the disk. It essentially trades the performance afforded by multiple layers of caching in modern disks for the reliability of knowing that no data is still in-flight when the call returns.
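To make the trade-off concrete, here is a small shell sketch: two hypothetical helpers that append a record to a log, one of which calls `sync` before reporting success. The `sync` command asks the kernel to flush all pending writes system-wide (and, on Linux, waits for that to finish), whereas a database would more likely call `fsync` on its own file; the helper names are invented for illustration.

```shell
#!/bin/sh
# fast_append LOG RECORD: append and return immediately. The record may
# still be sitting in a cache when this returns.
fast_append() {
    printf '%s\n' "$2" >> "$1"
}

# durable_append LOG RECORD: append, then flush pending writes before
# reporting success. Slower, but the record is not left in flight.
durable_append() {
    printf '%s\n' "$2" >> "$1" || return 1
    sync
}
```

Timing a loop of each (for example with `time`) is a quick way to see how much of a "high-performance" claim comes from skipping the flush.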

Of course, the no-sync approach doesn’t go nearly far enough; I bet these database developers could improve performance still more by avoiding writing anything to disk in the first place.

Flash’s inaccessible installer

June 18th, 2010

Daring Fireball links to a report that the installer for the latest version of Adobe Flash is incompatible with OS-level assistive technologies on both the Mac and Windows. So if you need, for example, a screen reader to interact with a computer, the standard Flash installer will just look like an empty window to you. On the plus side, you won’t even have to pretend to read the EULA.

I rarely miss an opportunity to enjoy Flash-related schadenfreude, and am completely in favor of any criticisms of Adobe installers, which generally resemble the sort of software provisioning technology that might have been designed by mid-level bureaucrats in Soviet satellite states. But I’m also reminded of the accessibility concerns surrounding cash machines in the mid-1990s. Isn’t it a little silly to complain that visually-impaired users won’t be able to use the inaccessible installer for the latest version of a browser plugin that exists exclusively to render inaccessible web content?

The data buffet

June 2nd, 2010

I’m glad to see that I will have the option to cease subsidizing the heaviest 2% of data users on AT&T’s network. If you had asked me two years ago, before I got a phone that I actually wanted to use on the internet, I would have regarded a bandwidth cap as anathema, a step backwards even from the endless nickel-and-diming I experienced on Verizon’s data network.

But since getting such a phone (and, so I thought, using its data capabilities fairly heavily), I have never used more than 200 megabytes of cell network data in a month; Andrea has never used more than 100 megabytes. In the last seven months (charted below from my online AT&T bill), I haven’t even come that close to 200 megabytes. If we choose to switch from “unlimited” bandwidth to the new AT&T plans, we will save $30 per month. (We also have 2.5 days of “rollover minutes” for voice, but I suspect that we will have to continue to subsidize heavy voice users to some extent.)


Is this how you want to sell your company?

May 8th, 2010

Apparently the company that runs the Stack Overflow web site recently secured some funding, and the founders have been characteristically glib about the possibilities enabled by their extra capital.

There’s a good post at the 37signals blog that details why it’s probably a bad idea for Stack Overflow to take VC (or, at least, why their stated reasons for taking VC don’t make sense), but it seems that the founders must already know that this is the case. In their blog post announcing the funding, the Stack Overflow gang describe it as follows:

So we created Stack Exchange to bring the technology behind Stack Overflow to a much wider variety of sites. We tried charging for Stack Exchange, and that didn’t work so well. So we asked ourselves, “How would the people of 1999 solve this problem?”

And the best answer we could come up with was, let’s make the damn thing free, and get some VC somewhere to pay for it.

Indeed, this business model sort of made sense to me in 1999. Except then, “make the damn thing free” was called “release your core product as open source and run a consulting business,” and “Stack Exchange” was called “The ArsDigita Community System.” Of course, there are important differences: for example, ArsDigita was profitable. But for a lot of the 1999-vintage businesses that took funding without a plan for profitability, “get some VC somewhere to pay for it” didn’t end so well for anyone involved: founders, investors, or engineers.

(I’m not suggesting that Atwood and Spolsky’s actual business model is “coast on external funding until we can magically discern some way to become profitable.” But if you’re making flippant comparisons to the business climate of 1999, then serious comparisons to the business climate of 1999 should at least have occurred to you.)

Is this how you want to sell software?

May 6th, 2010

Most of the software I write is released under permissive open-source licenses, but I’m sympathetic to people who want to make a living selling licenses for proprietary application software. Even so, I can’t understand the “bundle sales” phenomenon that has been widely touted in the Mac world for the last few years. A lot of people have already written about how these bundles are a bad deal for developers and users, and, indeed, about how they only work out well for the bundle promoters.

Furthermore, a lot of bundles seemingly rely on aggregating a huge number of low-quality programs into one inexpensive package. If I had a decent program that I was hoping to license to end-users, I don’t know why I’d want to put it in a flea market of hyperspecialized, half-baked programs of dubious utility. (This tactic is also antithetical to the sensibilities of the stereotypical Mac user. Indeed, if I had wanted a lot of shovelware that I’d likely never use, I’d just have bought a Vista-ready notebook from Best Buy, since these typically include shovelware preinstalled gratis.)

The truly remarkable thing, though, is that the bundle promoters seem to be embracing this flea market mentality. Take a look at the following image, which I cropped from a bundle-sale web page that someone mentioned on Twitter this morning:


This image of n application icons crammed into a cardboard box doesn’t say “these are quality tools that you will find useful and valuable.” Instead, it says “this box of junk didn’t sell at the yard sale and the weekend is almost over. Don’t look too closely, but you can have it for $20.”

Word dies; irony hardest hit

August 3rd, 2009

Jeremy Reimer wrote an article for Ars Technica claiming that Microsoft Word is dead. Some incidental aspects of his argument surely deserve additional scrutiny (e.g. the “people prefer software with more features” claim), but the main thrust is that Word is dead because documents now appear on the web instead of in print, or something, and there are better formats and tools for writing for the web:

What everyone had lost track of in the heat of battle was why we were still using Word (or OpenOffice Writer, which is—let’s face it—just a clone of Word) to create documents that were likely never going to be printed.

Word, to this day, is still largely a digital representation of a bunch of 8½ by 11 pieces of paper. Pages have numbers which you must use to reference them, and every page has a header and a footer. Word does have a display mode called “Draft” that makes it look more like an endless stream of toilet paper than separate pages, but I always switched to “Print Layout”—partly because Draft was so ugly, but mostly as a kind of unconscious reflex, a need to “know” what the printed form would look like even though I was rarely printing things out any more. Even in Draft mode, the pages are still there, and are always the same size.

One almost hesitates to point out that Reimer’s article — as it appears on the Ars Technica web site, unfettered by the antiquated constraints of physical media — is paginated. Furthermore, each page has a header, a footer, a sidebar, and a number. Perhaps some constraints die harder than we might wish.

Why Safari is better than Firefox

July 27th, 2009

In one sentence: Safari was designed by the organization responsible for the iPod, but Firefox was designed by the organization responsible for the “about:config” page.

Virtualizing a physical Linux machine

March 26th, 2009

Due to some hardware trouble with my main work machine, I’m presently working in a virtual machine on my personal computer. After a few dim trails, I found a pretty straightforward method to clone my work computer into a virtual machine image, so that I am able to work in the exact same environment I would have on my physical work computer. Here’s how to do it:

  1. Clone the drive using dd (the following example assumes your drive is /dev/sda and you have an external drive mounted at /media/removable):
  2. Use qemu-img to convert the raw bits of the drive to an image in the appropriate format for the virtual machine monitor you want to use (QEMU or VMware):
  3. Create a new virtual machine that uses this drive image, using the interface for your preferred virtual machine monitor.
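The commands for steps 1 and 2 might look something like the following sketch. The wrapper function names and the image file names are hypothetical; the `dd` and `qemu-img convert` invocations use only standard options, but check them against your own device names before running anything.

```shell
#!/bin/sh
# clone_drive SOURCE DEST: copy SOURCE byte-for-byte into the file DEST,
# one mebibyte at a time.
clone_drive() {
    dd if="$1" of="$2" bs=1048576
}

# convert_image RAW OUT [FORMAT]: convert a raw image with qemu-img.
# FORMAT defaults to qcow2 (for QEMU); pass vmdk for VMware.
convert_image() {
    qemu-img convert -f raw -O "${3:-qcow2}" "$1" "$2"
}

# Example, run as root and ideally with the source drive not in use
# (the file names here are made up):
#   clone_drive /dev/sda /media/removable/work.img
#   convert_image /media/removable/work.img /media/removable/work.vmdk vmdk
```

The external drive needs enough free space for both the raw copy and the converted image.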

I was able to image and convert a 100 GB drive in around six hours. My drive was an LVM volume and the home partition was encrypted with LUKS; I was delighted to see that qemu-img handled these oddball features of my drive flawlessly. (I can’t think of a technical reason why these wouldn’t be supported, but I’m nonetheless inclined to be pleasantly surprised when things work as they should out of the box.)

Cross-posted at Chapeau

Abuse of language notes

October 20th, 2008



I don’t know why, but this kind of thing — stilted passive-voice dialogese combined with numbingly inane neologisms — always cracks me up.

Design and “looking good”

July 25th, 2008

According to Gina Trapani, writing for Lifehacker, Ubuntu honcho Mark Shuttleworth is interested in improving the Linux desktop experience:

Ubuntu founder Mark Shuttleworth (who we interviewed last year) announced that he’s out to make Linux a better-looking operating system than Mac OS X—within two years.

Trapani then asks whether or not a “better-looking” Linux would motivate switchers:

Everyone loves eye candy on their desktop — Apple’s record-setting Mac sales can attest to that — but is looks is the main hurdle for Linux adoption amongst Normals?

This is notable, since it exemplifies a pervasive way to completely miss the point. People don’t use Apple’s computers because they’re pretty or feature “eye candy.” People use Apple’s computers because they work well. Design is not about how something looks; design is about how something works.


Pragmatic [REDACTED]

July 23rd, 2008

I suppose this is why the Core Animation book I pre-ordered from Amazon in March remains unshipped a week after its expected release date and why Amazon sent me a panicked “we have no idea when this will ship — do you still want it?” message.

In cheerier Pragmatic Programmers news, I watched some of their Erlang screencasts on a recent plane trip and am glad to endorse them. They’re certainly not a substitute for Armstrong’s excellent Erlang book, but they’re a nice taste of some very cool features of the language. I learned of these via DF, whose one-sentence summary of Pragmatic’s products hits everything I love and loathe about them. (Seriously, Bookman makes my skin crawl, and it’s only the beginning.)

(Confidential to readers who appreciate the idea of evaluating technical books on both content and typography: have I got a treat for you, and soon!)

38.4 percent

June 13th, 2008

The most amazing part of this otherwise-unremarkable NYT piece about the Yahoo-Google ad deal is the following paragraph:

While Google had 61.6 percent of the search market in the United States in April, according to comScore, Mr. Wiener said that Google’s dominance of search ads is even greater. Among 360i advertisers, it accounts for 75 percent to 80 percent of dollars spent, he said.

The 61.6 percent figure seems bafflingly small to me. I don’t want to get all Pauline Kael here, but who are the people who are responsible for the other 38.4 percent, and what on earth are they using for internet search?

Soot 2.3.0

June 4th, 2008

Here’s a shout-out to the Sable group at McGill. They’ve just released a new version of the Soot compiler framework, which I’ve used extensively in my dissertation work. If you need to analyze or transform Java source or bytecode, I can’t recommend it highly enough.

Sold, not licensed

May 30th, 2008

Here’s a fascinating ruling from the US District Court in Seattle, indicating that the transaction by which someone acquires a copy of a software package and the legal right to use the same is a sale and not a license:

[A]s Vernor’s lawyers pointed out, the distinction between a lease and a sale is based on the actual characteristics of the transaction, not merely on how the transaction is described by the parties. […] AutoCAD customers pay a lump sum at the time of purchase, with no obligation to make further payments or to return the software at the conclusion of the supposed lease.

As a consequence of this, the first sale doctrine applies, and Autodesk is unable to prevent customers from disposing of copies of AutoCAD by transfer or sale. I can’t imagine that this won’t be tied up in appeals for at least a decade, but I’m reeling at the implications.

Let me share a brief anecdote: A dumbed-down version of Propellerheads Reason came with my first real audio interface. It was crippled in nearly every way and basically served more as an advertisement for the real product than as a productive tool. I’m sure one could have used it to make real music, but I didn’t; I played with it for a little while and then shelved it. One day, after installing some additional memory in my PowerBook, I tried running this fractional Reason again. It demanded that I re-authorize, since I was “running on a different computer.” This level of draconian copy protection (on, essentially, a piece of shovelware) was enough to get me to drag the Reason folder to the trashcan and never think about it again.

My initial reaction to this ruling is: “well, it would be nice if all of these weird special-cases for copyright as it relates to sequences of bits were abrogated,” but I think the future is probably a lot darker. If “no resale” provisions are unenforceable, then it seems that the copy protection schemes for commercial software are about to get a hell of a lot more onerous. You think that you shouldn’t have to tie a serial number to a particular machine or authorize on-line? Wait until you have to tie your license to a statistical model of your typing patterns, or re-authorize online every time you start the application. You think that a scheme that sees a RAM installation or operating system upgrade and says OMG WTF THIS IS A TOTALLY NEW COMPUTOR is ridiculous? Wait until you lose your authorizations by switching to a different wireless network, or installing some new user-space applications. (This is not too far off, as those of us who remember MAC address-as-machine-ID schemes know….)

Think about two other things that were “licensed, not sold” before this ruling: DRM-infested digital media and fonts. In the case of subscription-model or rental digital media, this ruling appears not to apply — since those are transferred via a transaction that does not resemble a sale. In other cases, though, like iTunes movie “purchases,” or pay-per-song music downloads, one would have to circumvent some DRM in order to resell a song or movie. Therefore, it seems that the first sale doctrine (as re-established by this case) conflicts with the DMCA, which prohibits circumvention of copy protection (and does not have first sale, fair use, or even “hey, this copyright is expired” provisions). Of course, one could argue that it is fine to resell the bits constituting a digital download — they’d just be useless to anyone other than the original purchaser.

It’s perhaps more interesting to consider how DRM-free downloads (like Amazon MP3 or iTunes Plus) are affected, since there is no DMCA conflict here, and these are sold under conditions that explicitly forbid resale. An even greater version of this conundrum comes up with commercial fonts, which not only prohibit resale but are licensed with a whole host of restrictions ranging from more-or-less reasonable (don’t copy our font files for your printing company) to mildly outrageous (don’t actually use these fonts to produce documents or designs that anyone else can see). Of course, people who abide by these licenses do so because (1) hey, we agreed to this and it’s the right thing to do and (2) our license to use this font will be revoked if we don’t, rendering our investment worthless. If the transaction in which, for example, I give some money and they send me a bunch of weights of some nice face is a sale and not a license, though, that seems to impact point 2.

On web standards compliance

May 7th, 2008

Mark Pilgrim on the Mozilla project’s reaction to Firefox’s ACID 3 scores (which are lowest among all browsers not named “Internet Explorer”):

[M]an, you should all be embarrassed with yourselves. But you’re not, so here I am stepping up, publicly being embarrassed on your behalf. No need to thank me.

It’s kind of like losing a board game and then loudly claiming that you weren’t trying anyway.

A handy one-liner

April 24th, 2008

I used a catch-all email address for quite a while. This is great if you use a lot of throwaway addresses (e.g., for obnoxious compulsory web registrations), but the amount of spam is truly oppressive. I recently axed the catch-all address, and thus had to answer the question: “which email addresses have I received useful mail at?” Here’s a handy one-line script that will show you all the email addresses you’ve received mail at, assuming you use Apple’s Mail:

find "$HOME/Library/Mail" -name '*.emlx' -exec grep -h Envelope-to: {} + | cut -f2 -d' ' | cut -f1 -d, | tr A-Z a-z | sort -u

(This will work with minimal changes in other mailbox formats — you’ll probably only have to change the find part.)
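As a sketch of what changing the find part could look like, here is the same pipeline wrapped in a function that takes the directory and file-name pattern as arguments. The function name and the Maildir location are assumptions, and it only finds anything if your mail server records an `Envelope-to:` header in delivered messages.

```shell
#!/bin/sh
# envelope_addresses DIR [PATTERN]: list the unique, lowercased Envelope-to
# addresses in files under DIR whose names match PATTERN (default: all files).
envelope_addresses() {
    find "$1" -type f -name "${2:-*}" -exec grep -h 'Envelope-to:' {} + |
        cut -f2 -d' ' | cut -f1 -d, | tr 'A-Z' 'a-z' | sort -u
}

# Apple Mail:               envelope_addresses "$HOME/Library/Mail" '*.emlx'
# Maildir (assumed layout): envelope_addresses "$HOME/Maildir"
```

Letting `find` run `grep` directly keeps the pipeline working when mailbox paths contain spaces or when there are too many messages to fit on one command line.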


April 18th, 2008

Rob O’Callahan on rounding:

Unfortunately both [rounding towards zero and rounding away from zero] can get us into all kinds of trouble when we’re rounding values for use in graphics.

Nerdtacular bumper sticker

June 3rd, 2006

Doug Wyatt (of CoreMIDI fame) posts about a great bumper sticker on his blog. While my animus for bumper stickers is well-documented, I have to give a shout-out to such a nerdtastic display.

I’m currently listening to At Least Some Knots Get Untangled from the album “Heaps As” by DJ Olive