Philippe on Software: January 2007

Saturday, January 27, 2007

Massive Scalability

Here is an interesting article about how to scale a web site from thousands to tens of millions of users. These guys have redesigned their application many times over in order to get to the next level:

First generation: 2 web servers (ColdFusion), one database. Scaled up to 0.5 million users, which is pretty good. The database is the bottleneck.
Second generation: more web servers of course, and split the database into functional areas (i.e. one database for login, one for blogs etc.), and use a SAN instead of the machines' local disks. Scaled up to 2 million users.
Third generation: split the database by chunks of 1 million users, plus one last function-specific database for login. Scaled up to 10 million. The limit was not related to the unique login database but was due to the fact that the databases were not loaded evenly. An ad hoc approach for moving data between databases didn't work well because it became a full-time job for several people.
Fourth generation: solve the uneven load problem with a SAN from 3PAR, which can stripe a volume across thousands of disks, so a single volume/DB can deliver many more I/Os. Also add a caching tier between web servers and databases, and choose to store transient user data in memory instead of in the DB, a trade-off between reliability and performance (caching could have been done earlier). And finally they rewrote the application in C#/ASP.NET, which probably allowed developers to optimize their code. Scaled up to 17 million.
Fifth generation: switch to 4 AMD dual-core 64-bit/64 GB RAM machines for databases. They have 65 databases. Scales to 26 million.
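To make the third generation more concrete, here is a minimal sketch of routing users to databases in chunks of one million. The class name, the method names and the JDBC-style URLs are all hypothetical; the article does not describe the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of range-based sharding: users are assigned to databases in chunks of one million. */
final class ShardRouter {
    // Assumed chunk size, taken from the "1 million users per database" figure.
    static final long USERS_PER_SHARD = 1_000_000L;

    // Hypothetical connection strings, one per user database.
    private final List<String> shardUrls = new ArrayList<>();

    ShardRouter(List<String> shardUrls) {
        if (shardUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardUrls.addAll(shardUrls);
    }

    /** Index of the shard holding a user: ids 0..999,999 map to 0, the next million to 1, and so on. */
    static int shardIndexFor(long userId) {
        if (userId < 0) {
            throw new IllegalArgumentException("userId must be non-negative");
        }
        return (int) (userId / USERS_PER_SHARD);
    }

    /** Connection string of the database that owns the given user. */
    String urlFor(long userId) {
        int index = shardIndexFor(userId);
        if (index >= shardUrls.size()) {
            throw new IllegalStateException("no shard provisioned for user " + userId);
        }
        return shardUrls.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(
                Arrays.asList("jdbc:hypothetical://db0", "jdbc:hypothetical://db1"));
        System.out.println(router.urlFor(42L));        // prints jdbc:hypothetical://db0
        System.out.println(router.urlFor(1_500_000L)); // prints jdbc:hypothetical://db1
    }
}
```

A fixed mapping like this one never rebalances: if the users in one chunk happen to be much more active than the others, that database runs hot, which is the uneven load problem described above.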

Interestingly, this stuff also runs on Windows machines (I guess security is not an issue). They discovered a few funny facts about Microsoft: for instance, if you try to open more connections than the maximum that SQL Server can handle, it crashes :) And Windows has a feature that makes it shut down if the frequency of incoming network connections exceeds a limit, because it thinks it is the victim of a DOS attack (no pun intended).
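One common way to avoid opening more connections than a database server can handle is to put a hard cap on the client side. The sketch below uses a counting semaphore for that. It is only an illustration: the ConnectionGate class and its method names are made up, and the article does not say how the team actually worked around the problem.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Sketch of a client-side cap on the number of simultaneously open database connections. */
final class ConnectionGate {
    private final Semaphore permits;

    /** maxConnections should stay below whatever limit the database server tolerates. */
    ConnectionGate(int maxConnections) {
        this.permits = new Semaphore(maxConnections, true); // fair: waiters are served in FIFO order
    }

    /**
     * Waits up to timeoutMillis for a free slot. Returns false when none is available,
     * so the caller can fail fast instead of opening one connection too many.
     */
    boolean tryEnter(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Must be called exactly once after each successful tryEnter, typically in a finally block. */
    void leave() {
        permits.release();
    }

    /** Number of slots currently free. */
    int available() {
        return permits.availablePermits();
    }
}
```

A caller would wrap each database call in tryEnter and leave, opening the real connection only when tryEnter returns true.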

Sunday, January 21, 2007

Elegance matters

I found this photo of a Bloomberg terminal on Flickr. It has the reputation of being very powerful, in fact unmatched in terms of functionality, as well as lightning fast. It works with a command line (sounds like Emacs!). You can see it in every movie where some scenes take place on Wall Street or in the City. The last one I saw was A Good Year with Russell Crowe and Marion Cotillard (all London scenes show these terminals in the background).

Friday, January 19, 2007

Massive Project Failures

A friend of mine who works for a Big-5 consulting firm told me that she started a really big project for a company. She said that there is an army of consultants working on writing use cases, and that they plan to write use cases for a long time because there are hundreds of them. There is no developer on the project yet, but development will take place offshore when the use cases are finished. In the meantime, an architecture group is preparing the work for the future developers.

See any problem here? I do. It reminds me of another big project I worked on when I was also in a Big-5 consulting company. There were more than a hundred people on it. I gave 3 years of my life to this project, the client spent a fortune and I heard it is still not finished. The whole system was to be released at the end, when all the pieces were finished, instead of component by component.

Programme management is certainly no easy task, but sometimes it just takes common sense. Here is my advice to programme and project managers:

Rule #1: Hire The Right People

Your probability of success is a direct function of the skills and the motivation of your team. No amount of repeatable/optimizing processes will ever change this. Deal with it.

If you want success, hire experienced developers, which means spending the money on them. Be aware that good programmers have no trouble finding a job: they receive job offers every week regardless of the economy. They will need motivation anyway because they will work 10-to-12-hour days and some weekends. Of course if you manage to hire veterans you will need much smaller teams. I worked for a year and a half in a team of 12 programmers with an average of 15 years of experience, and I can honestly say that we produced more during this time than the hundred people did in 3 years on the past engagement I mentioned earlier (a lot more, and our applications are used every day by lots of people). Smart people attract smart people, so if you manage to hire a rock star you are already halfway there.

My past projects succeeded because I was lucky to have a few good men in my teams, sometimes people who are way smarter than me. Here is a question for all project managers working with offshore development teams: have you even met your developers?

Rule #2: Evolution, Not Intelligent Design

Design your software for the next few releases only; you will adapt the architecture later if needed. A recipe for failure is an Architecture Team responsible for all design decisions, throwing application frameworks "over the wall" to the developers who are not empowered to make their own choices. I learned this lesson the hard way.

This rule is also known as: Release Often. If your project hasn't put anything in production after 4 or 5 months, you have a serious problem. User feedback is critical to ensure that your application meets business needs, and for evolving the architecture. If you have to integrate mid-project releases with legacy systems so that your application can be used by the business, so be it.

The company I now work for has been one of the most successful companies in America during the last 20 years. We must be doing something right... I noticed two things during my first few days: first, programmers are kings; they are amongst the most respected people in the company. Second, managers push you to get the product in front of the client as soon as possible, even if it is quick and dirty, so we can sell it and refine it. If clients like it we'll spend the money fixing it.

Thursday, January 18, 2007

String and StringBuffer

I have been doing quite a few interviews during the last 3 months, looking for senior C++ and Java programmers. We are looking for people who are smart and who can get things done. It is interesting to see how many people have "how to" knowledge but lack fundamental understanding about how computers work. This is especially true for people who only know Java (I recommend against hiring a senior developer who only knows one programming language, whatever this language may be).

For example, let's say you ask what the difference is between String and StringBuffer. Easy enough, they respond: String cannot be changed while StringBuffer can be.

Then I ask why they think Java is designed this way. The C++ STL, for instance, has only one type, called string (or wstring), and it is mutable. So why did Gosling and Steele choose to use two classes?

You would not believe the number of candidates with 10 years of experience who cannot answer this question: they look at me puzzled, not knowing what to do. And I am really trying to help them. "Just don't think about strings, what advantage can you see for immutable objects in general? In a multi-threading context for instance?" Sometimes the best answer I can get is that "it helps with performance" but without any details.

I actually don't know the details about Guy Steele's decision, but any answer like these would work:

Immutable objects are thread-safe by definition.
Strings are often used as keys in hash tables, so it is useful to have an immutable string (otherwise the table would need to copy its keys to protect itself from later changes).
You can save memory by sharing a single String instance between all equal strings (the String.intern() method), which is only safe because nobody can modify the shared instance.
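The three answers above can be illustrated with a short Java sketch. The class and method names are made up for the example, and it only demonstrates observable behaviour, not the designers' actual motivation.

```java
import java.util.HashMap;
import java.util.Map;

/** Small demonstration of String immutability versus StringBuffer mutability. */
final class ImmutabilityDemo {

    /** "Modifying" a String returns a new object; the original is left untouched. */
    static String appendToString(String s, String suffix) {
        return s.concat(suffix);
    }

    /** A StringBuffer is changed in place, so every holder of the reference sees the change. */
    static void appendToBuffer(StringBuffer sb, String suffix) {
        sb.append(suffix);
    }

    /** Equal strings created at run time are distinct objects until they are interned. */
    static boolean internSharesInstance() {
        String a = new String("kernel");
        String b = new String("kernel");
        return a != b && a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String s = "abc";
        String t = appendToString(s, "def");
        System.out.println(s + " " + t);      // prints abc abcdef

        StringBuffer sb = new StringBuffer("abc");
        appendToBuffer(sb, "def");
        System.out.println(sb);               // prints abcdef

        // The map can keep a reference to the key without copying it:
        // reassigning the local variable creates a new String and cannot corrupt the table.
        Map<String, Integer> ages = new HashMap<>();
        String key = "alice";
        ages.put(key, 30);
        key = key.concat("!");
        System.out.println(ages.get("alice")); // prints 30

        System.out.println(internSharesInstance()); // prints true
    }
}
```

Because a String never changes after construction, it can be handed to several threads, stored as a hash key or shared through the intern pool without any locking or defensive copy.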

Here is another example. First question: "In an operating system, what is the kernel?" (easy question). Second question: "So how does your application communicate with the kernel?" (no clue). libc/glibc, system calls, context switching are apparently very abstract concepts for the average programmer.

Monday, January 15, 2007

MediaWiki for project documentation

As Joel points out, "Writing specs is like flossing: everybody agrees that it's a good thing, but nobody does it". Programmers prefer to express themselves in code rather than documents.

I have seen two extremes in software documentation. I used to work in a Big-5 consulting company (it shall remain unnamed) where everything had to be documented and approved before the first line of code was written. This is partially because you need some protection against scope creep which will surely happen: no client knows what they really need before the project is almost finished. This is also because big consulting companies sell CMMI processes for a living, not to mention that they now all outsource the actual development to India, keeping only requirements, architecture and QA in-house (so much for time to market).

I think of CMMI as project managers spending too much time thinking about their work, rather than actually doing it. The equivalent for software architects has to be Model Driven Architecture, i.e. software programming without programmers. Interestingly both target the same market: your CIO and IT managers.

The other extreme is of course no documentation at all, not even functional specifications. "Our job is to write code, not documents!" Yeah, right. The only time Agile Programming does not equate to cowboy coding is when the team is very small, composed exclusively of very senior developers, and when time to market is the main business concern. In my experience, in any other situation Agile Programming becomes Fragile Programming, delivering pieces of software only maintainable by their authors.

There has to be a right balance between time to market, maintainability and reliability which is to be defined for each project. I personally like RUP but only if I can pick and choose from RUP what is really useful for my project (applying RUP without tailoring is nonsense). No RUP deliverable is really mandatory for all projects, not even Use Cases. What is needed really depends on the scope of the project, as well as on who you have in your team.

I recently experimented with MediaWiki to build a software documentation repository for all projects in my group. We used to keep all our project documentation in MS Office documents stored in SharePoint, which doesn't provide more functionality than a hierarchy of folders on a shared drive (it actually stores documents in SQL Server and you can choose to browse or search using the most unfriendly web interface or with File Explorer).

MediaWiki is the open source software that runs Wikipedia, so our project repository looks like a small Wikipedia web site. It is a breeze to install provided that you have a Linux box and a MySQL database. It works very well and everyone likes it:

It makes it very easy for everyone to write a note about how to deploy a piece of software or to document a process. It is also easier to find information.

You don't have to be as formal as with Word documents.

It's fun to use! (not like flossing)

Crays

I decided to start writing a blog. Just a way for me to get some thoughts out of my system; most likely thoughts about software, and other things (absurdities, including software patents and perhaps politics if I get too upset reading the news one day). So here it is.

Here is a good link if you are, like me, fascinated by supercomputers, especially the ones "built by hand" by Seymour Cray and his team. It is a group of Cray fanatics (Crayons?) who collect old supercomputers from universities and research centers and bring them back to life in a warehouse somewhere in Germany. They have many machines from Control Data, Cray, SGI and NEC. These machines are not up all the time because they wouldn't have enough money to pay the electricity bill, but most of them are accessible during the weekend. You can visit the computer room once a year. You can also get shell access to their machines, if you want to try a real vector processor. From the USA it is a bit slow but it works. Here is their CRAY Y-MP:

I read a good book about the life of Seymour Cray: The Supermen (nothing to do with David Bowie). You won't learn a lot about computer architecture but rather about post-WW2 USA history and what not to do if you are an IT manager. Worth a read.

When Seymour learned that Apple had purchased a Cray to help them design the next Mac, he said "that is interesting, because I am using a Macintosh to design the next Cray".
