Re: [RFC v1 00/11] Add iMX95 neoisp driver

From: Anthony McGivern <hidden>
Date: 2026-02-12 08:45:14
Also in: linux-devicetree, linux-media, lkml

Hi Laurent,


On 10/02/2026 16:02, Laurent Pinchart wrote:

On Tue, Feb 10, 2026 at 12:20:42PM +0000, Anthony McGivern wrote:

quoted

On 10/02/2026 00:20, Laurent Pinchart wrote:

quoted

On Mon, Feb 09, 2026 at 01:19:43PM +0000, Anthony McGivern wrote:

quoted

On 05/02/2026 09:40, Jacopo Mondi wrote:

quoted

On Wed, Feb 04, 2026 at 07:30:18PM +0100, Antoine Bouyer wrote:

quoted

Le 04/02/2026 à 18:12, Jacopo Mondi a écrit :

quoted

On Tue, Feb 03, 2026 at 07:37:34PM +0100, Jacopo Mondi wrote:

quoted

On Thu, Jan 29, 2026 at 12:00:24AM +0100, Michael Riesch wrote:

quoted

On 1/28/26 09:17, Antoine Bouyer wrote:

quoted

On 1/26/26 10:44 AM, Michael Riesch wrote:

quoted

On 1/23/26 09:09, Antoine Bouyer wrote:

[snip]

quoted

   - How many media devices are registered and which driver registers it
     or them?

That will be part of the evaluation. My initial assumption is that
neoisp would be the appropriate component to register the media device
in this mode, since ISI is not involved, and ISI currently performs the
registration in the M2M configuration.

Isn't the ISP registering its own media graph ?

Yes, 8 copies of ISP media graph, that can be used with the 8 output video
devices of the ISI media graph.

I suggest you do what RPi does. The mainline driver only registers one
instance and they carry a little patch downstream that implements the
for() loop where multiple instances are registered. Duplicating media graphs
is not desirable (at least in mainline) as we can have ISPs with 256
contexts, we don't want 256 media graphs.

A framework level solution with proper priority handling and job
scheduling is what is required and that's what the context work should
end up being.

Our Mali-C720 ISP can support up to 16 contexts, each with over a dozen
subdevs and capture nodes. We imagine this will not be feasible for
upstreaming :) So using this framework is definitely the way we would
like to go. We are mainly limited by the lack of per-context graph/streams
configuration at this point.

quoted

Can we get a copy of all media graphs on an i.MX95 system including
the ISI and the CSI-2 receiver ?

Here is an example with multiple sensors. Or do you need it in another
format ?

No it's fine, thanks!

quoted

digraph board {
        rankdir=TB
        n00000001 [label="{{<port0> 0 | <port1> 1 | <port2> 2 | <port3> 3 | <port4> 4} | crossbar\n/dev/v4l-subdev8 | {<port5> 5 | <port6> 6 | <port7> 7 | <port8> 8 | <port9> 9 | <port10> 10 | <port11> 11 | <port12> 12}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000001:port5 -> n0000000f:port0 [style=bold]
        n00000001:port6 -> n0000001a:port0 [style=bold]
        n00000001:port7 -> n00000025:port0 [style=bold]
        n00000001:port8 -> n00000030:port0 [style=bold]
        n00000001:port9 -> n0000003b:port0 [style=bold]
        n00000001:port10 -> n00000046:port0 [style=bold]
        n00000001:port11 -> n00000051:port0 [style=bold]
        n00000001:port12 -> n0000005c:port0 [style=bold]
        n0000000f [label="{{<port0> 0} | mxc_isi.0\n/dev/v4l-subdev9 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000000f:port1 -> n00000012 [style=bold]
        n00000012 [label="mxc_isi.0.capture\n/dev/video8", shape=box, style=filled, fillcolor=yellow]
        n0000001a [label="{{<port0> 0} | mxc_isi.1\n/dev/v4l-subdev10 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000001a:port1 -> n0000001d [style=bold]
        n0000001d [label="mxc_isi.1.capture\n/dev/video9", shape=box, style=filled, fillcolor=yellow]
        n00000025 [label="{{<port0> 0} | mxc_isi.2\n/dev/v4l-subdev11 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000025:port1 -> n00000028 [style=bold]
        n00000028 [label="mxc_isi.2.capture\n/dev/video10", shape=box, style=filled, fillcolor=yellow]
        n00000030 [label="{{<port0> 0} | mxc_isi.3\n/dev/v4l-subdev12 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000030:port1 -> n00000033 [style=bold]
        n00000033 [label="mxc_isi.3.capture\n/dev/video13", shape=box, style=filled, fillcolor=yellow]
        n0000003b [label="{{<port0> 0} | mxc_isi.4\n/dev/v4l-subdev13 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000003b:port1 -> n0000003e [style=bold]
        n0000003e [label="mxc_isi.4.capture\n/dev/video14", shape=box, style=filled, fillcolor=yellow]
        n00000046 [label="{{<port0> 0} | mxc_isi.5\n/dev/v4l-subdev14 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000046:port1 -> n00000049 [style=bold]
        n00000049 [label="mxc_isi.5.capture\n/dev/video21", shape=box, style=filled, fillcolor=yellow]
        n00000051 [label="{{<port0> 0} | mxc_isi.6\n/dev/v4l-subdev15 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000051:port1 -> n00000054 [style=bold]
        n00000054 [label="mxc_isi.6.capture\n/dev/video22", shape=box, style=filled, fillcolor=yellow]
        n0000005c [label="{{<port0> 0} | mxc_isi.7\n/dev/v4l-subdev16 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000005c:port1 -> n0000005f [style=bold]
        n0000005f [label="mxc_isi.7.capture\n/dev/video23", shape=box, style=filled, fillcolor=yellow]
        n00000067 [label="mxc_isi.output\n", shape=box, style=filled, fillcolor=yellow]
        n00000067 -> n00000001:port4 [style=bold]
        n0000006e [label="{{<port0> 0} | 4ac10000.syscon:formatter@20\n/dev/v4l-subdev17 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000006e:port1 -> n00000001:port2 [style=bold]
        n00000073 [label="{{<port0> 0} | csidev-4ad30000.csi\n/dev/v4l-subdev18 | {<port1> 1}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000073:port1 -> n0000006e:port0 [style=bold]
        n00000078 [label="{{<port0> 0 | <port1> 1 | <port2> 2 | <port3> 3} | max96724 2-0027\n/dev/v4l-subdev19 | {<port4> 4 | <port5> 5}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000078:port4 -> n00000073:port0 [style=dashed]
        n00000081 [label="{{} | mx95mbcam 8-0040\n/dev/v4l-subdev20 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000081:port0 -> n00000078:port0 [style=bold]
        n00000085 [label="{{} | mx95mbcam 9-0040\n/dev/v4l-subdev21 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000085:port0 -> n00000078:port1 [style=bold]
        n00000089 [label="{{} | mx95mbcam 10-0040\n/dev/v4l-subdev22 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
        n00000089:port0 -> n00000078:port2 [style=bold]
        n0000008d [label="{{} | mx95mbcam 11-0040\n/dev/v4l-subdev23 | {<port0> 0}}", shape=Mrecord, style=filled, fillcolor=green]
        n0000008d:port0 -> n00000078:port3 [style=bold]
}

This was an interesting point from our side too regarding the context framework:
how would shared inputs be linked to independent contexts? For example, one input
port with 4 sensors where each is processed by a separate context.

If the multi-context ISP operates in M2M mode, the capture and ISP
pipelines will be disjoint (even if they're in the same media graphs).
Linking the two will be done by userspace, through memory buffers shared
between the pipelines.

In our ISP we don't have to operate in a pure M2M mode for multi-context.

Instead, we have a time division mode for multiple inline sensors simultaneously.
The context management unit writes incoming frames from multiple sensors to memory
buffers and automatically schedules them for processing, injecting each buffer into
the pipeline once it is available.

In the driver we just configure the context, provide internally allocated DMA buffers
and the scheduler automatically handles the rest. Of course we can get interrupts
for these events if we have use for them.

OK, so this is offline mode but with hardware (or firmware) scheduling.


Effectively yes. As you pointed out having an inline ISP operate with multi-context and
no buffering would have to be specialised. It's more the case here that it must be
configured once, then the ISP just provides interrupts on certain events.

quoted

While we could do this through userspace, it doesn't make full use of the ISP's
capabilities such as its hardware scheduling. From a media graph perspective, I
think it should be considered as an inline ISP with the buffers simply acting as
temporary storage while the ISP is busy.

Given that userspace will still have to supply parameters for the ISP,
as well as buffers for processed images, what would be the advantage of
scheduling the raw buffers automatically ?


It is mainly to reduce latency, since userspace parameters are written to per-context
memory regions rather than to the ISP registers directly. The ISP automatically loads
the correct region when the context starts. Effectively the IPA can be asynchronous
to the HW, though we make an assumption that the system is capable enough to run the IPA
and write the results to the context region during vertical blank, so as to ensure gains
are aligned with the sensor exposure.
The ISP can also still run without output video buffers, so we can still capture
statistics for frames we don't consume but maintain our IPA loop, i.e. for auto exposure.
In libcamera, the IPA loop is pretty much independent from requests, though we still need
to figure out synchronising controls and metadata...
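
To illustrate the asynchronous parameter path described above, here is a
minimal Python sketch. It is a toy model, not driver code. The class and
method names are hypothetical, and the split into a "pending" and an "active"
copy is a modelling assumption used to show that a write only takes effect
the next time the context starts.

```python
class ContextRegion:
    """Toy model of one per-context parameter region (names are hypothetical)."""

    def __init__(self):
        self.pending = {}  # written by the IPA at any time
        self.active = {}   # what the ISP uses for the frame being processed

    def write_params(self, **params):
        # The IPA writes to memory, never to ISP registers directly.
        self.pending.update(params)

    def start_context(self):
        # Models the ISP loading the region when the context starts.
        self.active = dict(self.pending)
        return self.active


region = ContextRegion()
region.write_params(gain=2.0)
region.start_context()         # frame N runs with gain=2.0
region.write_params(gain=4.0)  # IPA result arrives while frame N is in flight
frame_n_gain = region.active["gain"]
frame_n1_gain = region.start_context()["gain"]
```

In this model a late write from the IPA never disturbs the frame in flight; it
simply lands one frame later, which is why the write has to complete during
vertical blank for gains to line up with the sensor exposure.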

quoted

I guess my thought was the camera "frontend" would effectively have some shared
state across all contexts, but the outputs from this would go to per context instances.
Perhaps a similar thing would apply with this CSI-2 receiver and the ISI since they
appear to deal with multiple sensors that are then divided across their 8 contexts?

With a media graph that spans sensors, CSI-2 receivers, ISI and ISP, the
frontend part of the graph (sensors, CSI-2, ISI) would be handled with
one pipeline (in kernel terms, not a libcamera pipeline handler) and the
ISP with another pipeline. Those two pipelines would operate
independently. The ISP pipeline would make use of the multi-context API
while the frontend pipeline wouldn't.


This is one way we could go about it, and as we discussed DMA fences could
be utilized to still allow implicit HW scheduling while passing buffers from
the sensor frontend to the ISP backend. Of course this introduces additional load
in waking the userspace, and potential concerns around error handling if the
input fails to capture a buffer.
One other concern I realized is our HW scheduler needs to be aware of which inputs
are actually assigned to the context. With this separation in userspace that
could be difficult. Unless we make it explicit that each capture device on the
frontend ties to a specific context, i.e. cap0 -> context0, etc? That may have
issues of its own, like userspace now must know exactly which context it's using,
or potential misconfigurations not being validated and causing errors...
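
As a rough illustration of the validation concern, the following Python sketch
checks an explicit capture-device-to-context binding. Everything here is
hypothetical: the function name, the "capN" naming and the two rules checked
(index in range, one capture device per context) are assumptions for the
example, not an existing kernel or libcamera API.

```python
def validate_binding(binding, num_contexts):
    """Check a {capture_name: context_index} map and return a list of errors."""
    errors = []
    owner = {}  # context index -> capture device already bound to it
    for cap, ctx in sorted(binding.items()):
        if not 0 <= ctx < num_contexts:
            errors.append(f"{cap}: context {ctx} out of range")
            continue
        if ctx in owner:
            errors.append(f"{cap}: context {ctx} already bound to {owner[ctx]}")
        else:
            owner[ctx] = cap
    return errors


ok = validate_binding({"cap0": 0, "cap1": 1}, num_contexts=16)
clash = validate_binding({"cap0": 0, "cap1": 0}, num_contexts=16)
```

Where such a check would live (link validation, stream start, or userspace) is
exactly the open question.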


The other option we discussed was whether a media link could be made across
a single-context frontend and a multi-context backend, though it is unclear how
this would impact link validation, the APIs for interacting with it, etc. I guess
this would rely on first having media link states per context anyway.

quoted

As a test of multi-context with duplicated media graphs, we would segregate our
inputs between media devices, though this is less flexible as it strictly ties
one sensor to a particular context.

quoted

If I'm not mistaken you'll have 8 copies of the ISP media graphs, and
that's exactly what we're working on with the context framework :)

Ok. Then I should have a look at the context framework too ...

Please, I hope to be able to resume working on it sooner or later
given the right use case.

quoted

... since it is not, your assumption seems very reasonable.

quoted

   - How can the user decide whether direct (csi2isp) or indirect
     (mem2mem) streaming shall be used?

That will also be part of the evaluation. Selecting it from the DTS would be my
first option, but that may then prevent using both modes on the same platform.

Of course this depends on what the hardware is able to do, but in case the
HW is easily reconfigurable, I doubt that device tree is a good choice
to solve that.

quoted

While it is certainly OK to introduce this support only at a later
stage, it makes sense to consider this right from the start to avoid
some nasty changes e.g. in how this hardware is exposed to user space.

Also, we are facing a similar challenge with recent Rockchip ISP
hardware (RK3588, RK3576, ...) and it would be great to hear your
thoughts about that.

Is there an existing discussion thread available on this topic? I would
be very interested in following it.

Not yet, I am afraid. But there should be one or two soon (TM) :-)

It's probably time to have one :)

Good. Please loop me in ;)

You are in, this is the conversation ;)

It might be a good discussion point for the media summit in Nice
co-located with Embedded Recipes if people with interest in the topic
are going to be there.

I'm also adding Anthony from ARM as I know he's going through the same
inline/m2m duality you're now facing.

We make the issue even more complex as individual contexts can run in either
inline or m2m mode simultaneously... Though in our case the ISP does not
have any external dependencies for this like with Mali-C55 + IVC.

Simultaneously ? Can a single ISP instance run in inline and offline
mode simultaneously ? How does that work ?

Technically speaking, inline mode still requires memory buffers, but once configured
the ISP can run without involvement from the driver. The buffering is effectively
invisible at this point.

The context management unit facilitates this through the aforementioned hardware
scheduling. Each individual context may choose to use inline mode or M2M mode.
In inline mode, that context is "schedulable" when its input buffer is ready,
which occurs automatically once the image is fully written to memory.
In M2M mode, the context is "schedulable" when the user triggers it via SW.
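
To make the two "schedulable" rules concrete, here is a toy Python model. It is
only a sketch: the names (Mode, Context, pick_next, run_one) are hypothetical,
and the lowest-id-first arbitration is an arbitrary choice for the example, as
the actual arbitration policy is not described in this thread.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    INLINE = "inline"
    M2M = "m2m"


@dataclass
class Context:
    """Toy stand-in for one ISP context; all field names are hypothetical."""
    ctx_id: int
    mode: Mode
    input_ready: bool = False  # a full frame has been written to memory
    sw_trigger: bool = False   # software explicitly triggered the context

    def schedulable(self) -> bool:
        # Inline: ready as soon as the input buffer is fully written.
        # M2M: ready only once software has triggered the context.
        if self.mode is Mode.INLINE:
            return self.input_ready
        return self.sw_trigger


def pick_next(contexts):
    """Return the schedulable context with the lowest id, or None."""
    for ctx in sorted(contexts, key=lambda c: c.ctx_id):
        if ctx.schedulable():
            return ctx
    return None


def run_one(contexts):
    """Run one job and clear the condition that made it schedulable."""
    ctx = pick_next(contexts)
    if ctx is None:
        return None
    if ctx.mode is Mode.INLINE:
        ctx.input_ready = False
    else:
        ctx.sw_trigger = False
    return ctx.ctx_id


contexts = [Context(0, Mode.INLINE), Context(1, Mode.M2M)]
contexts[1].sw_trigger = True   # user queues an M2M job
contexts[0].input_ready = True  # a sensor frame finishes landing in memory
order = [run_one(contexts), run_one(contexts), run_one(contexts)]
```

Both kinds of context share one scheduler in this model; only the event that
marks them ready differs.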

quoted

As a side note, was there any thought into how libcamera may support a pure m2m
use case, say by passing user provided frames rather than indirectly coming from
a sensor? Perhaps there is already something for this that I've missed.

https://lists.libcamera.org/pipermail/libcamera-devel/2025-December/055627.html

I expect more work to be needed before we can finalize an API, as I
think different people will have very different ideas of how this should
work.

Ah nice thanks :)

I took a quick skim through and it seems pretty good. When I have some time
I will try to pull this series to test on our side.

What are your use cases ?


For m2m it could be cases where we interact with an external sensor running
its own auto exposure loop. Or we may even use it for passing precaptured
sequences of frames through the ISP as part of a sensor tuning process.
In this case, we could modify the behaviour of the IPA so we only process stats
once we are aware of the next request's controls i.e. user provided sensor exposure,
rather than running the IPA on frame end.


Thanks,
Anthony

quoted

This series is posted as RFC because extending the v4l2-isp interface may
overlap with ongoing work. If similar development already exists, I am
happy to rebase or adapt the series accordingly. If preferred, the series
can also be split into two parts: the v4l2-isp rework and the Neo ISP
driver introduction.

A few checkpatch warnings in v4l2-ioctl.c remain intentionally to stay
consistent with the existing style in that file.

Testing was performed on the i.MX95 EVK using the media/next kernel in
standalone M2M mode. End-to-end camera-to-ISP capture has been validated
using the downstream NXP kernel, as some hardware dependencies are not
yet upstreamed.

[snip]
