When we last left off, we had reproduced the basic behavior that users reported, using the stock command line tools such as v4l2-ctl and “mplayer /dev/video1”. Now we’ll talk a bit about the debugging process.
To give some context, let’s look at a quick diagram showing the various parts of the device:
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/hvr_1800_block.png”][/lightbox]
For starters, let’s now look at what the actual user sees after running the following commands:
v4l2-ctl -d 1 -f 62.25 --set-fmt-video=width=720,height=480,pixelformat=mpeg
mplayer /dev/video1
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_1.png”][/lightbox]
The video feed itself is a set of color bars, a well-known reference video source. As you can see, the video is corrupt in a number of ways: there appears to be a basic formatting problem, and the color is missing. Note that this is on the tuner input, so the problem could be in any of several places in the pipeline: the tuner itself not tuning, the analog video demodulator, the video decoder, or the MPEG encoder. However, if we run the same test through the composite or s-video input, we bypass the frontend (the tuner and analog demodulator) and can determine whether the problem lies there. So let’s run the following and see what mplayer puts out:
v4l2-ctl -d /dev/video1 --set-input=2
Running this command and then mplayer shows us the following:
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_svid_cb.png”][/lightbox]
Look familiar? It’s *almost* the same as the previous image. This suggests that the basic video corruption is somewhere either in the video decoder or MPEG encoder. You’ll also note that the first image from the tuner input shows quite a bit of static, while the feed through the s-video input is very clean. We’ll get back to that later (as it suggests there are in fact multiple problems).
For now, let’s stick to debugging on the s-video input. Once we’ve fixed whatever is wrong there, the situation may just improve significantly for the tuner input.
Now TekDoc was kind enough to already have done some bisecting, and found that the following patch causes the formatting bug to go away:
+++ b/drivers/media/video/cx25840/cx25840-core.c
@@ -661,7 +661,7 @@ static void cx23885_initialize(struct i2c_client *client)
* - enable raw data during vertical blanking.
* - enable ancillary Data insertion for 656 or VIP.
*/
- cx25840_write4(client, 0x404, 0x0010253e);
+ // cx25840_write4(client, 0x404, 0x0010253e);
/* CC on - Undocumented Register */
cx25840_write(client, 0x42f, 0x66);
This code change in fact reverts a change Steven made to get VBI working. We know which value written to register 0x404 causes the corruption, but we don’t know what is actually different, because we don’t know what the register held before that write. Fortunately, that’s where v4l2-dbg becomes a *very* handy tool:
v4l2-dbg -d /dev/video1 --chip=0x44 --list-registers=min=0x400
ioctl: VIDIOC_DBG_G_REGISTER
          00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000400: 00 60 00 00 31 25 10 00 00 80 00 00 00 87 08 00
00000410: e7 43 ff ff 03 7d 13 00 80 80 00 01 00 00 00 00
00000420: 0f 3e 1c 00 7a 00 2d 5b 1a 60 1e 1e 00 00 60 42
00000430: 9b 03 00 00 00 00 00 00 00 00 00 00 06 00 f0 ff
00000440: 24 08 03 08 00 00 00 00 00 00 70 81 00 10 1f 16
00000450: 02 08 00 00 0d c4 08 26 77 88 00 54 00 00 00 00
00000460: 02 14 0a 34 6e ca 36 06 e7 00 00 08 20 f6 84 02
00000470: 84 00 2d 5f 22 00 24 26 1f 02 58 60 63 82 0a 01
00000480: 91 00 00 00 00 00 83 42 21 ff 2f f8 dc 40 10 00
00000490: 8a 02 3f cd 00 03 1f 16 40 20 00 00 14 00 50 14
000004a0: 0f 02 0c 00 00 00 00 00 00 00 00 00 00 00 00 00
000004b0: 00 00 00 00 04 00 00 00 0a 14 14 00 00 00 00 00
000004c0: c8 0a f0 c0 00 00 00 00 00 00 00 00 00 00 00 00
000004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000004f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
This dumps out the register state for the video decoder core built into the cx23887 (commonly known as the Mako core). The Mako has been used in a number of Conexant designs, and you will see huge similarities between the cx25840 (a standalone Mako) and chip designs which have the Mako embedded in them (such as the cx23885 and cx23123). Hint: the cx25840 datasheet is publicly available on the Internet even though the cx23887 datasheet is not.
The important thing at this point, though, is that we can see that register 0x404 is 0x00102531 at power-up (once we comment out the line in the driver which changed it). Do some bitwise arithmetic against the 0x0010253e the driver was writing, and you will see that only the least significant byte differs: it changed from 0x31 to 0x3e, an XOR of 0x0f, so bits 0 through 3 all flipped.
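If you would rather not do the bit twiddling by hand, a few lines of Python will do it. This is just a sketch: the helper name `changed_bits` is mine, and the only inputs are the four bytes the dump above shows at 0x404 through 0x407 and the value from the driver patch.

```python
def changed_bits(old, new):
    """Return the bit positions (0 = least significant) where old and new differ."""
    diff = old ^ new
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Bytes listed in the dump at 0x404..0x407, lowest address first.
dump_bytes = bytes([0x31, 0x25, 0x10, 0x00])

# The dump lists the low byte first, so assemble the 32-bit value little-endian.
power_up = int.from_bytes(dump_bytes, "little")

# Value the driver writes in cx23885_initialize() (from the patch above).
driver_value = 0x0010253E

print(hex(power_up))
print(hex(power_up ^ driver_value))
print(changed_bits(power_up, driver_value))
```

The XOR isolates exactly the bits that differ between the two values, which is usually the first thing you want to know before opening a datasheet.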
Just as a quick test (and without having to change the driver source and rebuild/reinstall), let’s just poke the register:
v4l2-dbg -d /dev/video1 --chip=0x44 --set-register=0x404 0x31
and the resulting video?
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_svid_404fix.png”][/lightbox]
The color is still screwed up, and there are weird vertical bars, but this is *much* better. It means the vertical and horizontal resolution are now known to be set up correctly, and it’s picking up hsync/vsync properly.
If we run v4l2-dbg against an older kernel, we can get a dump of the video decoder registers for comparison. In my case, I just went back to 3.0.0, which is known to work. From there it’s just a matter of iterating through the registers one at a time and watching the video. Fortunately, the video decoder core can be reprogrammed in real-time without having to stop and start streaming.
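Comparing two dumps by eye gets old quickly, so here is a rough sketch of how the comparison could be scripted. It assumes the dump text looks like the v4l2-dbg output shown earlier (an eight-digit address, a colon, then hex bytes). The function names are hypothetical, and the "good" rows below are not a real 3.0.0 dump: they are the two rows from the dump above with the four values poked in this post substituted in.

```python
import re

# Matches dump rows such as "00000410: e7 43 ff ff ..."
ROW = re.compile(r"^([0-9a-fA-F]{8}):((?:\s+[0-9a-fA-F]{2})+)\s*$")


def parse_dump(text):
    """Map register address -> byte value for every row of a dump."""
    regs = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skips the "ioctl:" line and the column header
        base = int(match.group(1), 16)
        for offset, byte in enumerate(match.group(2).split()):
            regs[base + offset] = int(byte, 16)
    return regs


def diff_dumps(good, bad):
    """Return (address, bad_value, good_value) for each register that differs."""
    return [(addr, bad[addr], good[addr])
            for addr in sorted(good.keys() & bad.keys())
            if good[addr] != bad[addr]]


# Rows copied from the dump earlier in this post (current kernel).
BAD_ROWS = """\
00000410: e7 43 ff ff 03 7d 13 00 80 80 00 01 00 00 00 00
00000420: 0f 3e 1c 00 7a 00 2d 5b 1a 60 1e 1e 00 00 60 42
"""

# ASSUMPTION: synthesized for illustration, not an actual 3.0.0 dump.
GOOD_ROWS = """\
00000410: e7 43 ff ff 03 7d 13 00 00 00 00 01 00 00 00 00
00000420: 82 82 1c 00 7a 00 2d 5b 1a 60 1e 1e 00 00 60 42
"""

for addr, bad, good in diff_dumps(parse_dump(GOOD_ROWS), parse_dump(BAD_ROWS)):
    print(f"0x{addr:03x}: 0x{bad:02x} -> 0x{good:02x}")
```

In practice you would redirect the v4l2-dbg output from each kernel into a file and feed both files through `parse_dump`.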
We found a couple of issues with the setup of the horizontal scaler, and tweaking those registers eliminated the vertical bars:
root@isengard:~# v4l2-dbg -d /dev/video1 --chip=0x44 --set-register=0x418 0x00
root@isengard:~# v4l2-dbg -d /dev/video1 --chip=0x44 --set-register=0x419 0x00
as can be seen here:
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_svid_418419fix.png”][/lightbox]
Now we can see that the picture is properly formatted, but the color is still wrong. Some more poking around shows that the hue fields are improperly programmed. Again, two more register pokes…
v4l2-dbg -d /dev/video1 --chip=0x44 --set-register=0x420 0x82
v4l2-dbg -d /dev/video1 --chip=0x44 --set-register=0x421 0x82
and voila!
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_svid_420421fix.png”][/lightbox]
Now in the above cases, having the datasheet wasn’t essential. I just dumped the registers under both the 3.0 and current kernels, and then compared the registers one at a time, then watched the output as I sync’d them up. Of course it’s possible that multiple registers need to be programmed, so if you’re iterating from registers 0x400 to 0x500 and the video improved at register 0x418, then you may need to backtrack and make sure that the changes to registers 0x400 through 0x417 *didn’t* have any material effect.
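The backtracking step is easy to get wrong when you are typing pokes by hand, so one option is to generate the commands. The sketch below only builds command strings in the same form used in this post (same device and chip address); it does not run anything. `poke` and `backtrack_plan` are names I made up, and the list of differences uses values already seen above.

```python
def poke(addr, value, device="/dev/video1", chip="0x44"):
    """Format one v4l2-dbg register write the way it is typed in this post."""
    return (f"v4l2-dbg -d {device} --chip={chip} "
            f"--set-register=0x{addr:03x} 0x{value:02x}")


def backtrack_plan(diffs, improved_at):
    """Commands that put every register synced *before* improved_at back to
    its old (suspect) value, so you can check whether those earlier changes
    actually mattered."""
    return [poke(addr, bad) for addr, bad, good in diffs if addr < improved_at]


# (address, value under the current kernel, known-good value)
diffs = [
    (0x404, 0x3E, 0x31),
    (0x418, 0x80, 0x00),
    (0x419, 0x80, 0x00),
]

# Forward pass: sync one register at a time, watching the video after each.
for addr, bad, good in diffs:
    print(poke(addr, good))

# Suppose the picture improved at 0x418: undo the earlier pokes and re-check.
for command in backtrack_plan(diffs, improved_at=0x418):
    print(command)
```

If the picture degrades again when an earlier register is reverted, that register is part of the fix too; if nothing changes, it can be dropped from the list of suspects.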
Now if we toggle the input back to the tuner, we see that the video is now properly formatted, but there is static and a complete lack of color:
[lightbox title=”HVR-1800 analog debugging, part 2″ href=”../../blog/wp-content/uploads/2012/05/1800_tuner_nochroma.png”][/lightbox]
We’ll get back to this problem, but at least we now have some idea how much effect fixing the video decoder registers had on the tuner input.
Of course, at this point we don’t actually have a fix. But now we understand which registers are being improperly programmed, and we can start digging into the code to see how they get into that state. It could be that the driver is failing to program them due to some exception condition which causes a function to return prematurely. It could be that the driver is simply setting them to garbage values. Or it could be some other explanation. But at least now we know what we’re looking for…
In our next episode, we’ll dig into what those registers actually do, and see if we can identify what logic in the driver causes them to go awry. We cannot simply change the values to the ones that work, as we need to understand *why* they are set as they are and what logic needs to be changed in order for the driver to work properly with both the HVR-1800 (which is currently broken) as well as the HVR-1850 (which currently works). This is one of the key challenges in having a single driver that supports multiple variants of the same chip – you need to come up with a single body of code which takes into account the subtle differences between the different chips.
It probably also makes sense to try to keep track of how much time is spent in the debugging effort (to offer some perspective as to how much work is actually involved). The content that formed the first post from last week took about three hours to set up, much of which was bootstrapping hardware, installing operating systems, downloading/compiling kernels, etc. For tonight’s post, I started the work about two hours ago (about half of which was the actual debugging and half writing this post). Stay tuned!