SAA7164 New Patches

I’ve posted a new set of patches to the saa7164-dev tree. The significant change relates to how the driver deals with command queueing, in an effort to resolve commands that appear to time out, often seen as I2C message failures. A positive side effect is that these patches also lower overall system CPU time, and the platform feels much more responsive under heavy load.

I’ve just begun testing these patches in a private tree and initial indications are very good. Feel free to try them if you’re experiencing bad behavior from your HVR2200/50 driver.

As always, comments and feedback are welcome.

Update: These patches are now merged into the stable tree.

12 thoughts on “SAA7164 New Patches”

  1. Steve,
    I have good news!! This new dev release has solved my I2C problem!!
    Thank you so much for your assistance with this, I can’t tell you how grateful I am!

    Regards,
    Nasha

  2. A quick note to say these patches ran all day yesterday for me, streaming from digital cable and hammering on the I2C interfaces at 10 times the normal speed – causing an excessive and abnormal workload in the system, and it’s still stable as a rock.

    Late last night I terminated those tests and started MythTV. I left it watching LiveTV and background recording my normal production schedule. This morning it was still as smooth and stable as a rock.

    The patches will likely go into -stable in the next 24-48 hours.

  3. Steven,
    The comments about firmware not responding (acking a command?) for a very long time
    look very familiar; I ran across this with the cx18 driver. I found the failure mode was
    actually that a wake_up() doesn’t necessarily wake up a schedule()’ed thread right away.
    Depending on the system load, it could be a very long time before the process gets back
    to running. The occasional long wait was causing application playback to not be
    smooth, since I would try to give buffers back to the firmware in the application
    read() when it had drained a buffer. The solution was to give the empty buffers back to
    the firmware in a worker thread, so the application wasn’t schedule()’ed during its read()
    call to the driver when there was data available.

  4. @Andy, Yeah – I’d suspected that the worker behind wake_up would be subject to the normal scheduling rules, and therefore it could be a while before it was triggered. For reference, I’d put those comments in the following patch, https://www.kernellabs.com/hg/~stoth/saa7164-dev/rev/1d5c77a143cb, as part of the timeout changes. Oddly, that only solved part of the overall problem. The larger issue was that the interrupt (acking the execution of a firmware command) would not always fire when expected under heavy load. The real solution was for me to actually check the state of the bus during deferred processing and signal waiters accordingly.

    As always, thanks for your post, it’s good to hear from developers with similar circumstances.

  5. @Steven,
    > The larger issue was related to the fact that the interrupt (acking the execution of a
    > firmware command) would not always fire when expected under heavy load.

    Yeah, I had something similar with the CX23418. It turns out that it was not the CX23418’s
    fault, nor the cx18 driver’s fault. The CX23418 was sharing an interrupt line with a
    disk controller handled by the ahci driver, IIRC. That driver’s ISR can sometimes take a
    really long time and would essentially mask the CX23418’s IRQ long enough for the
    CX23418’s firmware to give up waiting for the cx18 driver and move on. The result
    was lost acks and incoming buffer transfers.

    Not sure if any of that information helps here…

    -Andy

  6. @Andy, interesting. That’s something I’d privately speculated upon after spending a large number of hours logged into an Australian user’s system trying to debug some odd issues. I was seeing cases where an expected interrupt did not fire, or triggered (apparently) very late. In the customer’s case they had nvidia, a USB controller, an Intel HD audio controller and the saa7164 all registered for the same IRQ, and the BIOS would not let them change this.

    It wasn’t clear to me where my instance of the saa7164’s interrupt handler was in the kernel’s internal chain, and I suspected something higher in the chain was being a bad player. I had no evidence of this other than ‘Works for everyone else’.

    I made a mental note to develop some simple-to-add, debugging-only patches to find the appropriate kernel structures (query/display the IRQ calling order), just for fun. It’s nothing more than a mental note at this stage. I’d also like to write a reasonably generic set of statistics functions to collect time/execution/delay-related information from a running driver.

    …. not enough hours in the day. 🙂
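
The approach described in comment 4 (checking the state of the bus during deferred processing and signalling waiters, rather than relying only on the ack interrupt) can be illustrated with a small userspace model. This is only a sketch in Python, not the saa7164 driver code: `CommandSlot`, `FakeBus`, `deferred_poll` and `run_demo` are hypothetical names invented for illustration, and the real driver's data structures, locking and timeouts will differ.

```python
import threading


class CommandSlot:
    """One outstanding firmware command that a caller is waiting on (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._signalled = False

    @property
    def signalled(self):
        with self._cond:
            return self._signalled

    def signal(self):
        """Wake every waiter; in a driver this would be called from the ISR path."""
        with self._cond:
            self._signalled = True
            self._cond.notify_all()

    def wait(self, timeout):
        """Return True if signalled before the timeout, False if the wait timed out."""
        with self._cond:
            return self._cond.wait_for(lambda: self._signalled, timeout=timeout)


class FakeBus:
    """Stand-in for the firmware message bus: remembers which commands have a response."""

    def __init__(self):
        self._lock = threading.Lock()
        self._responses = set()

    def post_response(self, seq):
        with self._lock:
            self._responses.add(seq)

    def has_response(self, seq):
        with self._lock:
            return seq in self._responses


def deferred_poll(bus, pending):
    """Deferred work: look at the bus itself and signal any waiter whose response
    is already there, whether or not the ack interrupt was ever delivered."""
    woken = []
    for seq, slot in pending.items():
        if bus.has_response(seq) and not slot.signalled:
            slot.signal()
            woken.append(seq)
    return woken


def run_demo(interrupt_lost=True, run_deferred=True):
    """Issue one command and report whether the waiter was woken in time."""
    bus = FakeBus()
    slot = CommandSlot()
    pending = {1: slot}
    result = {}

    waiter = threading.Thread(
        target=lambda: result.update(ok=slot.wait(timeout=0.5))
    )
    waiter.start()

    bus.post_response(1)       # the firmware has finished the command
    if not interrupt_lost:
        slot.signal()          # normal path: the interrupt wakes the waiter
    if run_deferred:
        deferred_poll(bus, pending)

    waiter.join()
    return result["ok"]


if __name__ == "__main__":
    print("lost IRQ, with deferred check:", run_demo(True, True))
    print("lost IRQ, no deferred check:  ", run_demo(True, False))
```

In this model a lost interrupt with no deferred check looks like a command timeout even though the response is already sitting on the bus, which is the symptom the comments describe; the deferred check turns that case back into a normal completion.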
