Sunday, April 16, 2006

 

Kernel Problems in Apple Town...

This week's I, Cringely "Pulpit" article again (surprise, surprise) includes a discussion about Apple.

I think most people will glaze over at his brief mention of Apple's 'kernel problems' (as he puts it), but it really made the hair on the back of my neck stand up.

For those of you who don't make a habit of following the trends of kernel innards, Mac OS X is built on what is a 'barely not experimental' kernel architecture called 'Mach' - the specific implementation is called XNU. Wikipedia is a great reference on this subject, as usual: Mach, XNU.

One of the major 'drawbacks' that critics raise against the microkernel architecture in general is that it's 'slow'. What they mean is that it can often be 'slower' than other architectures (read "Linux") because it specifically forbids ANYTHING but the kernel from running in 'kernel mode'. This means that ANYTHING that has to interact with hardware (memory, hard drive, network card, USB, etc.) has to first ask the kernel for access, LOCK the kernel (preventing anything else from using it), use the resource, and then release the kernel. All this switching around takes up a little bit of overhead time, but it is SAFE, because all these extra processes run in user space, and problems in user space don't affect kernel space.
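Here's a toy Python sketch of that safety property - the names (ToyKernel, do_io, buggy_user_space_driver) are all mine, invented for illustration, and this is nothing like real XNU code; it just models "one kernel lock, drivers in user space, driver crashes stay contained":

```python
import threading

class ToyKernel:
    """Toy model: the kernel owns all hardware access behind one lock."""
    def __init__(self):
        self._lock = threading.Lock()   # the single 'kernel lock'
        self.state = "consistent"

    def do_io(self, device, operation):
        # Every hardware request takes the kernel lock, does its work,
        # then releases it - that per-request handoff is the overhead.
        with self._lock:
            return f"{device}: {operation} done"

def buggy_user_space_driver(kernel):
    """A driver running OUTSIDE the kernel: its crash is contained."""
    try:
        kernel.do_io("network", "send")
        raise RuntimeError("driver bug!")   # blows up in user space...
    except RuntimeError:
        pass                                # ...but only the driver dies

kernel = ToyKernel()
buggy_user_space_driver(kernel)
print(kernel.state)   # prints "consistent" - kernel survives the crash
```

The point of the model: because the driver's failure happens outside the kernel, the kernel's lock and state come through intact.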

Linux (and I think the Windows NT kernel does this too, though NT is usually considered a microkernel hybrid...) puts things like the network stack RIGHT IN the kernel (this is called a monolithic kernel) to make them FAST, but it also makes them unprotected - anything that runs in kernel space can bring the whole kernel down if a problem occurs.

Personally I'm of the 'nothing in kernel space except the kernel!!!' mind, but many people are in favour of more speed regardless of the cost.

In the early days of Mac OS X things were way worse than they are now. There is a concept called 'locking granularity'. 'Fine-grained' locking means that every resource can be locked individually, whereas 'coarse-grained' locking means that you have to lock everything up to use anything - it's a spectrum from fine to coarse with everywhere in between.

You might think the ideal situation is completely fine-grained locking, so that absolutely every hardware resource can be locked individually and used without fear of conflicting hardware requests. BUT eventually you get to a point where the kernel is doing so much 'book-keeping' to take care of all these locks - who holds which ones, who's allowed to take which ones when, etc. - that you get LESS performance with finer-grained locking than you would with coarser locking and less overhead. The balance is tricky. Trickier still with more hardware execution threads (hyperthreading, multi-core, multi-processor, and all permutations of these).
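You can see the book-keeping cost in miniature with a counting wrapper around a lock (CountingLock is a made-up name; this just counts acquires, it doesn't measure real kernel overhead): a request that touches four resources costs one lock operation under a coarse scheme, four under a fine-grained one.

```python
import threading

class CountingLock:
    """Wraps a lock and counts acquires - a stand-in for 'book-keeping'."""
    operations = 0  # shared tally across all locks

    def __init__(self):
        self._lock = threading.Lock()

    def __enter__(self):
        CountingLock.operations += 1
        self._lock.acquire()

    def __exit__(self, *exc):
        self._lock.release()

resources = ["memory", "disk", "network", "usb"]

# Coarse: one lock covers everything -> one acquire per request.
coarse = CountingLock()
CountingLock.operations = 0
with coarse:
    pass  # pretend we touch all four resources here
print(CountingLock.operations)   # 1

# Fine: one lock per resource -> four acquires for the same request.
fine = {r: CountingLock() for r in resources}
CountingLock.operations = 0
for r in resources:
    with fine[r]:
        pass
print(CountingLock.operations)   # 4
```

In a real kernel each of those extra acquires costs atomic operations and cache traffic, which is why "more locks" eventually stops being a win.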

It used to be that if you wanted ANY piece of hardware you had to lock the ENTIRE kernel, blocking anyone else who needed hardware at the same time. But this is kind of silly, because if one process is using the hard disk and another needs the network card, there's no real conflict there.

The problem is that completely fine-grained locking is very hard to do, very hard to coordinate, and very difficult to maintain and debug. It's hard. Most importantly it's hard for third-party DRIVER writers, like HP and Canon and Sony and Belkin and... It's easy when you can assume the kernel only has one or two locks. Much harder when there are many locks and you need to take the right ones at the right time and release the right ones at the right time, etc, etc, etc... it can be a lot of pressure for a driver to handle. (read "it can lead to shitty drivers that lock up the whole OS just to write 10 bits to the network card")
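One classic piece of that coordination burden is lock ordering: if driver A takes (network, then memory) while driver B takes (memory, then network), they can deadlock. The usual discipline is a fixed global ranking that everyone acquires in. A sketch, with made-up subsystem names and ranks:

```python
import threading

# One lock per subsystem, each with a fixed global rank. The names and
# ranks here are hypothetical - real kernels publish an ordering like
# this in their driver documentation.
LOCK_RANK = {"memory": 0, "disk": 1, "network": 2, "usb": 3}
LOCKS = {name: threading.Lock() for name in LOCK_RANK}

def acquire_in_order(*names):
    """Always take locks in ascending rank, so two drivers needing
    overlapping sets of locks can never circularly wait on each other."""
    ordered = sorted(names, key=LOCK_RANK.__getitem__)
    for name in ordered:
        LOCKS[name].acquire()
    return ordered

def release_all(names):
    for name in reversed(names):
        LOCKS[name].release()

# A driver asks for (network, memory) in the 'wrong' order;
# the helper takes memory first anyway, so no deadlock is possible.
held = acquire_in_order("network", "memory")
print(held)          # ['memory', 'network']
release_all(held)
```

Every extra rule like this is one more thing a printer-driver author can get wrong, which is the author's point about coarse locks being kinder to third parties.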

Mac OS X has gotten considerably better since its inception, but it's still not perfect - not by a long shot. (I think there's something like four locks now instead of one, but that's still pretty coarse.)

There's a great Ars Technica article about all of this in case anyone's interested in a bit more of a technical run down of how this stuff works. Especially check out the section entitled "concurrency".

Now, Bob Cringely says that Mac OS X has 'kernel problems'. I don't know how I feel about that. Yes, I think Mac OS X is starting to get 'long in the tooth' engineering-wise in the kernel department. But I think it's been progressing quite nicely (as the Ars article linked above should point out). I don't think there's an engineer out there who would deny that Apple's approach (actually NeXT's approach, since this stuff was pretty much all decided way back around 1988 when NeXTSTEP was written) is gutsy and technically bold. Technically brilliant in many ways. They're heading off in a very different direction than the other commercial kernel developers (Windows NT, Windows 'Vista' - which was supposed to get a rewritten kernel, though that probably won't happen now - Linux, the BSDs, Solaris, everyone).

In actual fact, XNU is a hybrid between FreeBSD's kernel and a true Mach kernel, plus I/O Kit. It's weird. But good. (I think people generally love I/O Kit.)

So, is Mac OS X slower at network operations (as an example) than Linux and Windows? Probably.
Is it significant? Probably not...
Is it safer? Definitely.
Is it workable and/or improvable? For sure.

But, the big question is: Does Mac OS X have kernel 'problems' as Bob Cringely says they do?
I don't know. Maybe.

Maybe they'd like to go with a straight-up FreeBSD kernel, with the I/O Kit stuff on top.
Maybe they're writing a new one?
Maybe they're reimplementing lots of stuff to essentially remake the XNU kernel they have?

I don't know.

But I do know that what Mac OS X has is an architectural advantage, and a technological differentiation, over the other kernels out there. It seems like it'd be a shame to give up on it now.
