"The API is simple because the problem it’s solving is trivial." I beg to differ...

lucozade · on Aug 7, 2014

I think that may be part of the author's point. The api isn't solving the underlying problem; it's solving the api problem, and the two are different.

Quite often a lot of the implementation complexity is exposed even though there's no real need as far as the api is concerned, largely because no-one really designed the api; it followed on from the implementation.

Not that I'm defending the implementation, but I do have one quibble with the article. The author didn't need real-time events but he used a real-time api. Under those circumstances it's pretty hard to come up with a protocol that doesn't expose some of the complexity without incurring a penalty on either the system or the api. Having said that, the usual approach is to restrict what you can do so it can't cause damage, in which case you can usually make a simple api.

angersock · on Aug 7, 2014

Well put.

A great example of this is the OpenGL API for fixed-function stuff (say, pre 2.0).

Simple glBegin, glEnd was easy to understand, but hid a lot of complexity. As new features were added (and, woefully, old calls supported and not deprecated) the API got more complicated and harder to use without trampling internal state.

APIs (usually) should be as simple as possible for most use cases, regardless of how gnarly the implementation is.

balloot · on Aug 7, 2014

When an engineer uses the word "trivial," what you should hear is "There are some complications that I'd rather ignore, so let's just hand-wave the answer".

to3m · on Aug 7, 2014

What are these complications? The API's job appears to be to let you iterate through a list of log items/read data from a log buffer/however you want to imagine it. That sort of thing is not rocket science, no matter how difficult it was to make that data in the first place.

(Besides, even if you don't think there's anything wrong with the way it provides the caller with data from the list, there's always the session nonsense to point and gawp at.)

wfunction · on Aug 7, 2014

> That sort of thing is not rocket science

Uh, for starters, the buffer doesn't have infinite size. It will overflow. What is the system supposed to do here? There are a million possibilities (discard old data, discard new data, allocate more memory, write to a file, call a callback, return an error, stall the rest of the system or halt the clock, etc.); some make sense, some don't. Between those that do, the user needs to be able to choose the best option -- and the time-sensitive nature of the log means you can't just do whatever pleases you; you have to make sure you don't deadlock the system. That's not by any means a trivial task, and I'd bet the reason you think it's so easy is that you haven't actually tried it.

to3m · on Aug 7, 2014

Yes, that's reasonable. But I'm not sure how this doesn't just boil down to configuring how the list is built up. You'd still be iterating through the list afterwards.

The system's hands are somewhat tied, I think. The events are building up in kernel mode, so it can't just switch to the callback for each one, not least because the callback might be executing already (possibly it was even the callback that caused whatever new event has been produced). So all it can do, when an event occurs, is add the event to a buffer - handling overflow (etc.) according to the options the caller set - though I don't think a callback is practical as this would involve switching back to user mode - for later consumption by user mode code. In short, it's building up a list, and perhaps the API could reflect that.

This is not to suggest that it would be easy to get to there from here. I've no doubt it could be literally impossible to retrofit an alternative API without rewriting everything. Just that I don't see why in principle an event tracing API can't work in some more straightforward fashion.
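
To make the "building up a list" idea concrete, here is a minimal sketch in Python. It is not the API from the article: `EventLog`, `record`, and the event values are hypothetical names made up for illustration, and an unbounded `deque` stands in for whatever buffer the kernel side would really use. The point is only that the consumer sees a plain iteration, and that an event produced while the consumer is handling another one is queued rather than delivered re-entrantly.

```python
from collections import deque


class EventLog:
    """Hypothetical consumer-facing view of a trace: just a queue of events."""

    def __init__(self):
        self._pending = deque()

    def record(self, event):
        # Stands in for the producer (kernel) side: only ever appends.
        self._pending.append(event)

    def __iter__(self):
        # Consumer side: hand out events oldest first until none are left.
        # Events recorded while the caller is busy are picked up later
        # in the same loop instead of interrupting the caller.
        while self._pending:
            yield self._pending.popleft()


log = EventLog()
log.record("a")
log.record("b")

seen = []
for event in log:
    seen.append(event)
    if event == "a":
        log.record("c")  # handling "a" itself causes a new event
```

Overflow handling is deliberately left out here; it would be a property of how the queue is filled, not of how it is read.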

__david__ · on Aug 7, 2014

> What is the system supposed to do here? There are a million possibilities…

No, there are two: You dump old data or you dump new data. Everything else should be up to the user code. It's really not as difficult as you are making it out to be. There's certainly no excuse for a ridiculous API as described in the article.

wfunction · on Aug 7, 2014

Huh? If you dump data you miss events. Imagine if Process Monitor decided to suddenly dump half of the system calls it monitored. Wouldn't that be ridiculous? For a general event-tracing system, there have to be more options provided. Maybe it wouldn't matter so much for context-switching per se, but for a ton of other types of events you really need to track each and every event.

__david__ · on Aug 8, 2014

Yes, you miss events. But if you try to build the kitchen sink into your low-level logging system then it ceases to be low level. If your logging system allocates memory then how can you log events from your VM subsystem? If your logging system logs to the disk, then how do you log ATA events? It becomes recursive and intractable.

The solution is to make your main interface a very simple pre-allocated ring buffer and have userspace take that and do what they please with it (as fast as it can so things don't overflow).

There is always a point at which your logging system can't keep up. At the kernel level you decide which side of the ring buffer to drop (new data or old) and at the userspace level you decide whether to drop things at all or whether to grind the system to a halt with memory, disk, or network usage.
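
A rough sketch of that kind of buffer, in Python for readability only (a real kernel implementation would look nothing like this, and it would also need to deal with concurrency, which this ignores). The names `RingBuffer`, `DropPolicy`, `push`, and `drain` are assumptions for illustration. Storage is allocated once in the constructor, and when the buffer is full the configured policy decides whether the oldest or the newest event is lost.

```python
from enum import Enum


class DropPolicy(Enum):
    DROP_OLDEST = "drop_oldest"
    DROP_NEWEST = "drop_newest"


class RingBuffer:
    """Fixed-capacity event buffer; storage is allocated once, up front."""

    def __init__(self, capacity, policy):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity  # pre-allocated, never grows
        self._capacity = capacity
        self._policy = policy
        self._head = 0    # index of the oldest stored event
        self._count = 0   # number of events currently stored
        self.dropped = 0  # events lost to overflow

    def push(self, event):
        """Producer side. Returns False if the incoming event was discarded."""
        if self._count == self._capacity:
            self.dropped += 1
            if self._policy is DropPolicy.DROP_NEWEST:
                return False
            # DROP_OLDEST: free the oldest slot by advancing the head.
            self._head = (self._head + 1) % self._capacity
            self._count -= 1
        tail = (self._head + self._count) % self._capacity
        self._slots[tail] = event
        self._count += 1
        return True

    def drain(self):
        """Consumer side: remove and return all stored events, oldest first."""
        events = []
        while self._count:
            events.append(self._slots[self._head])
            self._slots[self._head] = None
            self._head = (self._head + 1) % self._capacity
            self._count -= 1
        return events
```

Userspace would call something like `drain()` in a loop and decide for itself what to do with the events and with the `dropped` counter.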

wfunction · on Aug 9, 2014

The options are not simply "drop data" or "don't drop data". The options depend on the logging source, because not every logging source requires a fixed-size buffer. The API itself needs to support various logging sources and thus needs to support extensible buffers (e.g. file-backed sources, the way ProcMon does). Whether or not a particular logging source supports that is independent of whether or not the generic logging interface needs to support it.

__david__ · on Aug 9, 2014

I think we're talking past each other here. I don't think we're disagreeing on the userspace part. I'm not even implying that the low-level kernel interface should have unconfigurable buffer sizes. They should be configurable, but pre-allocated and non-growable. You're right, the userspace part can do whatever it wants. But I stand by my last paragraph (you either drop or grind things to a halt).

acdha · on Aug 7, 2014

> Huh? If you dump data you miss events. Imagine if Process Monitor decided to suddenly dump half of the system calls it monitored. Wouldn't that be ridiculous?

All sorts of systems have worked like this in the past (search for "ring buffer overwrite"). If you can't assume unlimited storage, you have to make a decision whether it's more important to have the latest data, dropping older samples, or whether it's more important to maintain the range of history by lowering precision (e.g. overwriting every other sample).

> but for a ton of other types of events you really need to track each and every event.

If you really need this, you have to change the design to keep up with event generation. That's outside the scope of a low-level kernel API where performance and stability trump a desire for data.
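
As an illustration of the "maintain the range of history by lowering precision" option, here is one possible sketch in Python. The class and method names are hypothetical, and it uses list slicing for brevity where a real fixed-size buffer would compact in place. When the buffer fills up, it keeps every other stored sample and from then on accepts only half as many incoming samples, so the retained history stays evenly spaced over the whole run.

```python
class DecimatingBuffer:
    """Bounded sample store that halves its resolution whenever it fills up."""

    def __init__(self, capacity):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self._capacity = capacity
        self._samples = []
        self._stride = 1  # keep one of every `stride` incoming samples
        self._seen = 0    # total samples offered so far

    def push(self, sample):
        index = self._seen
        self._seen += 1
        if index % self._stride != 0:
            return  # skipped at the current resolution
        if len(self._samples) == self._capacity:
            # Full: thin the history to every other sample and halve
            # the rate at which new samples are accepted.
            self._samples = self._samples[::2]
            self._stride *= 2
            if index % self._stride != 0:
                return  # this sample falls between the new, coarser steps
        self._samples.append(sample)

    def samples(self):
        return list(self._samples)
```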

otterley · on Aug 7, 2014

At Sun, whenever an engineer claimed a task or problem was "trivial", the work was assigned to him or her.

A lot fewer problems were deemed "trivial" after that.

kabdib · on Aug 7, 2014

Well, the API in question (which I've used, and it was indeed an unpleasant experience) might not be solving something trivial, but it's certainly not well designed.

My all-time worst API is SetupAPI, which despite its name is how you get access to USB devices on Windows. It's . . . pretty miserable. Runner-up is the COM-based stuff that manages the Windows firewall, which is not well specified and has 'interesting' timing issues.

I have mercifully lost most of my memory of the Java stuff I was doing 15 years ago. That stuff made me hate life.

zyb09 · on Aug 7, 2014

My award goes to Extended MAPI. It took me weeks of trial and error just to read and send email messages through an Exchange Server. I remember people were selling 3rd party wrappers for the API, because it was so horrible.

pcunite · on Aug 7, 2014

I used the Extended MAPI API for years ... I thought there was something wrong with me as I struggled through the insanity ... until I saw modern APIs.

stavros · on Aug 7, 2014

Man, and my idea of an ugly API is urllib2...

lafar6502 · on Aug 7, 2014

Yeah, MAPI certainly deserved a mention

Dylan16807 · on Aug 8, 2014

The API is not the service. The service is solving a hard problem. The API is reading rather simple data from the service in batches. The API's problem is a trivial one.