I spent some hours over the last few days browsing through Edward Tufte's fine book Visual Explanations. Although it sometimes gets into graphical issues far more complex than I normally need, it is great material both for learning how to present results and for using data to support analytical thinking. To help retain some of the lessons I just learned, I decided to keep a few notes here:

- When presenting data visually, ensure there is a scale and a reference point for the reader

- It is often useful to look at data on scales one order of magnitude larger and smaller than the actual quantities

- Place data in an appropriate context for assessing cause and effect. This includes reasoning about reasonable explanatory variables and expected effects.

- Make quantitative comparisons. "The deep question of statistical analysis is compared to what?"

- Consider alternative explanations and contrary cases.

- Assess possible errors in the numbers reported in the graphics.

- In particular, aggregations over time and space, although sometimes necessary, can mask or distort data.

- Make all visual distinctions as subtle as possible, but still clear and effective. Think of the elements in your displays as having degrees of contrast: if everything (background, axes, data, ...) has the same contrast, everything gets the same attention. Applying this to background elements, in particular, makes the data stand out (see the sketch after this list).

- Keep critiquing and learning from visual displays, whether you find them useful or not.
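As a small illustration of the contrast point above, here is a minimal matplotlib sketch, with illustrative data and styling choices of my own rather than an example from the book. It gives the data series the strongest contrast while keeping the background and axes subtle, and labels the scale for the reader:

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only.
x = np.arange(2000, 2011)
y = np.array([12, 14, 13, 17, 19, 18, 22, 25, 24, 28, 31])

fig, ax = plt.subplots(figsize=(6, 3.5))

# The data gets the strongest contrast...
ax.plot(x, y, color="black", linewidth=2, marker="o")

# ...while reference elements stay subtle: light gridlines,
# gray spines and ticks, no box around the plot area.
ax.grid(True, color="0.85", linewidth=0.5)
for spine in ("top", "right"):
    ax.spines[spine].set_visible(False)
for spine in ("left", "bottom"):
    ax.spines[spine].set_color("0.6")
ax.tick_params(colors="0.4")

# Always give the reader a scale and labeled references.
ax.set_xlabel("year")
ax.set_ylabel("widgets sold (thousands)")

plt.tight_layout()
plt.show()
```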

Amazon S3 had a major availability incident this Sunday, and today they posted a very transparent update on their blog about the causes of the problem and the actions they are taking to prevent it from happening again.

From their report, it seems that a bit corruption in a control message (which is bound to happen in a system at such a large scale), combined with a gossip protocol that (apparently) spread too much information across the system, caused mayhem in server communication. When the engineers understood what was going on, they realized that the way to bring the system back to normal operation was to stop it and clear its state, which is popularly known as restarting it.

Lessons learned? Mainly, (1) if the scale is large enough, all kinds of bizarre behaviors will eventually show up, and (2) having an efficient red button to bring the system to a clean state is very useful if you are running a long-lived system. (I'd add that spreading the system's state too widely is a trade-off between global knowledge and robustness, but that would lead to a lengthy discussion.)

Interestingly, these two lessons were discussed a while ago by the operators of PlanetLab in a USENIX paper. PlanetLab has also experienced corrupted control messages (for which we typically do not compute checksums) and implemented a red button, which has already been used on at least one occasion, in December 2003.
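To make the checksum point concrete, here is a hypothetical Python sketch, not taken from S3 or PlanetLab, that wraps a control message with a CRC-32 before it is gossiped and rejects it on receipt if it was corrupted in transit:

```python
import json
import zlib


def wrap_control_message(payload: dict) -> bytes:
    """Serialize a control message and prepend a CRC-32 checksum."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    checksum = zlib.crc32(body)
    return checksum.to_bytes(4, "big") + body


def unwrap_control_message(data: bytes) -> dict:
    """Verify the checksum and return the payload, or raise if corrupted."""
    expected = int.from_bytes(data[:4], "big")
    body = data[4:]
    if zlib.crc32(body) != expected:
        raise ValueError("corrupted control message, refusing to act on it")
    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    msg = wrap_control_message({"node": "a1", "status": "alive"})
    print(unwrap_control_message(msg))              # round-trips cleanly

    corrupted = msg[:-1] + bytes([msg[-1] ^ 0x01])  # flip one bit
    try:
        unwrap_control_message(corrupted)
    except ValueError as err:
        print(err)                                  # corruption is caught
```

CRC-32 catches random bit flips cheaply; a production system might prefer a stronger digest or end-to-end integrity checks, but the point is simply to validate control messages before acting on them or forwarding them to other nodes.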

Fubica pointed me to an interesting service today which I believe is an exciting exploration of business models that leverage commons-based peer production.

The service is FON, "a community of people making WiFi universal and free", according to the website. The principle is that people buy a sharing-enabled router with one secured channel (isolated from the owner's) that anyone participating in FON can use to access the Internet. This way, if you share some of your WiFi, you get access to other people's. An extra feature is that the resulting wireless network can be accessed by people not participating in FON, for a fee. The revenue from these accesses is shared between the FON company and the owner of the WiFi spot used.

What I find most exciting, however, is FON's business model. They are not the providers of the WiFi service themselves. Instead, they facilitate it by building the necessary technology (the WiFi routers) and by providing an authority that eases access control, the allocation of excess resources (which always have market potential), and billing.

However, I think what is new here is the type of system and perhaps the extent to which this business model is being applied, not the business model itself.

Looking at the bigger picture, providing enhanced governance for peer-production systems is already being explored as a business model. After all, someone does profit from websites like digg and flickr. Thinking a bit further, money is spent to have PlanetLab administrators handle security, among other aspects of the shared platform.

Considering these possibilities within the same framework opens some interesting questions: When is a service provided by an external (centralized?) entity necessary to enhance peer-production systems? How does introducing an economic valuation of the shared service affect contributors' perception of the system, and therefore their behavior? How sustainable is sharing under such conditions, as users start to game the system and try to profit from it?