Wednesday, November 5, 2008

Jury Duty: Part Deux

I'm in my second day of jury duty. The upside is that if I make it through today, I am done.

Spent most of last night investigating Kaffeine's buffering model, since I am trying to figure out why the video is so crappy. A few obviously missing exception-handling cases aside, it looks like this is a more fundamental problem.

Kaffeine has three threads of execution that are relevant here:
  • Thread 1 reads the /dev/dvb/adapter0/dvr0 device file and inserts the MPEG packets into a local buffer
  • Thread 2 reads the local buffer and pushes the data into a pipe (and optionally writes it to disk too)
  • Thread 3 is the instance of Xine that reads the pipe and renders the stream to the display
Note that in the main data path, if you are watching live TV, the data is never written to disk. So if the system stalls for even a short period, you will lose packets.

After adding a bunch of comments, I can see two basic failure cases:
  1. In some cases the thread reading the device file gets overflow errors from the kernel, indicating that it is not reading the device file fast enough
  2. In some cases, the buffer pool populated by the device thread fills up before the thread that services the pipe can read it out. When that happens, the incoming packets are dropped instead of being inserted into the buffer.
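
To make the second failure case concrete, here is a minimal sketch of a fixed-size packet pool that drops packets once it is full. This is not Kaffeine's actual code: `packet_pool`, `pool_put`, `pool_get`, and the tiny `POOL_SLOTS` value are all hypothetical names picked for illustration, and the sketch is single-threaded, so a real version shared between two threads would also need locking.

```c
#include <stddef.h>
#include <string.h>

#define TS_PACKET_SIZE 188   /* size of one MPEG transport stream packet */
#define POOL_SLOTS 4         /* hypothetical pool size, tiny on purpose */

/* Hypothetical fixed-size pool of packets, filled by the device thread
 * and drained by the pipe thread. */
struct packet_pool {
    unsigned char slots[POOL_SLOTS][TS_PACKET_SIZE];
    size_t head;             /* next slot to write */
    size_t tail;             /* next slot to read */
    size_t count;            /* slots currently in use */
    unsigned long dropped;   /* packets lost because the pool was full */
};

/* Producer side: returns 1 if the packet was stored, 0 if it was dropped. */
static int pool_put(struct packet_pool *p, const unsigned char *pkt)
{
    if (p->count == POOL_SLOTS) {
        p->dropped++;        /* failure case 2: the consumer fell behind */
        return 0;
    }
    memcpy(p->slots[p->head], pkt, TS_PACKET_SIZE);
    p->head = (p->head + 1) % POOL_SLOTS;
    p->count++;
    return 1;
}

/* Consumer side: returns 1 and fills `out` if a packet was available. */
static int pool_get(struct packet_pool *p, unsigned char *out)
{
    if (p->count == 0)
        return 0;
    memcpy(out, p->slots[p->tail], TS_PACKET_SIZE);
    p->tail = (p->tail + 1) % POOL_SLOTS;
    p->count--;
    return 1;
}
```

Pushing six packets into this four-slot pool without draining it stores four and drops two, which is exactly the kind of loss that shows up as glitches in live video.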

Obviously, making the buffer pool bigger might reduce the risk of the second condition, but the first condition is more troubling. It suggests that a thread whose sole responsibility is to read the device file cannot keep up with the data flow.

The thread in question actually works as follows:
poll() on the device fd
if (data available) {
    read(fd, buf, 188)
    copy data from buf to memory buffer of size (8x188)
    if (memory buffer full) {
        announce (8x188) bytes to second thread
    }
}
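
For comparison, here is a minimal sketch of a batched alternative: wait with poll(), then ask for many packets in a single read(). It assumes the dvr device will hand back more than one 188-byte packet per read() when the caller supplies a larger buffer, which is worth confirming against the driver. The `read_batch` name and the `BATCH_PACKETS` value are hypothetical and not taken from Kaffeine.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define TS_PACKET_SIZE 188   /* size of one MPEG transport stream packet */
#define BATCH_PACKETS 64     /* hypothetical: packets requested per read() */

/* Wait up to timeout_ms for data on fd, then pull up to `cap` bytes with a
 * single read(). Returns the number of bytes read, 0 on timeout or end of
 * file, and -1 on error (with errno set by poll() or read()). */
static ssize_t read_batch(int fd, unsigned char *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;
    ssize_t n;

    /* Retry if a signal interrupts the wait. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready <= 0)
        return ready;        /* 0 = timed out, -1 = poll() failed */

    /* One system call for however many bytes are waiting, up to cap. */
    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

A caller would pass a buffer of `TS_PACKET_SIZE * BATCH_PACKETS` bytes and divide the return value by `TS_PACKET_SIZE` to get the packet count. If the device delivers data in whole packets, the system call rate drops by up to the batch factor.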

Now, if you do the math, you might appreciate why this is a bit disturbing. With one read() call per 188-byte MPEG packet, a 19 Mb/s stream amounts to almost 13,000 read() calls per second. Of course, there can be multiple elementary streams within that, so it might only be around 6,000 read() calls per second. That's still a hell of a lot of read calls.
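
The arithmetic behind that figure can be sketched as a one-line helper. It assumes the rounded 19 Mb/s rate quoted above, with 1 Mb taken as 1,000,000 bits; `reads_per_second` is a hypothetical name used only for this illustration.

```c
/* Number of read() calls per second needed to keep up with a stream of
 * bits_per_second when each call returns bytes_per_read bytes.
 * Integer division, so the result is rounded down. */
static unsigned long reads_per_second(unsigned long bits_per_second,
                                      unsigned long bytes_per_read)
{
    return bits_per_second / (bytes_per_read * 8UL);
}
```

`reads_per_second(19000000UL, 188UL)` gives 12632, the "almost 13,000" above. Reading the full 8x188-byte buffer in one call, `reads_per_second(19000000UL, 8UL * 188UL)`, gives 1579.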

It's also possible that there is some other sleep() call or some equivalent that stalls the pipeline, but I haven't found one yet.