owned this note
owned this note
Published
Linked with GitHub
# CAM API REDO
It's time to eliminate the union ccb from the CAM API. This document describes the API changes, how we'll do it in a mostly KBI compatible way, and how we'll manage the cut-over.
## Problem statement
There's places in the code that do unsafe things like pass a subset of a cam ccb, usually on the stack, into a call chain that shuffles it around as a union *ccb. This can be dangerout because only a subset of the union is valid in some cases. This is a historical practice that evolved when this sort of API was the best way to handle subclassing. Since then, the FreeBSD kernel has shifted paradigms abit. We now pass around the 'base class' rather than some super-set class that this union represents.
It would also be nice to have some way to check the casting to make sure that the right function is set in the ccb header for the desired target cast item (with no checking for casts to ccb_hdr and the union if it still remains). In C, checks like this would need to go through a macro, which gets a bit ugly and verbose (one of the unsettled items at present).
## Proposed Changes
Here's the changes that I think we need to make. Each one will be discussed in a subsection. We can use this list to prioritize the work and design a backwards compatible solution
### Ditch union ccb (really radically curtail)
There's 1500 instances of this in the tree, however, so doing so will be tricky. So we won't be able to do it since we have thousands of references to the union, so it won't die for a while.
More likely, we'll need to slowly cleanse the the tree of uses of union ccb * that aren't to get the size of the largest ccb. And it's used in a number of APIs.
Status: <span style="color:red">Still exploring the full impact, so more research needed</span>
### Change xpt_action() to take any kind of ccb pointer.
This is straight forward.
```
/*
* A generic to convert a pointer to any ccb to the ccb header including a ccb
* header. For ccb_hdr pointers, it uses the pointer directly. For all other
* types, we rely on the fact that there is a ccb_h element to that structure
* and we take the address of that element to get the ccb_hdr we pass to
* xpt_action. We don't list all possible CCBs here and rely on the default: to
* pick up new ones that are added and to indirectly enforce the requirement
* that the first element of all ccb's be the ccb header named ccb_h.
*/
#define cam_ccb_to_hdr(gccb) \
_Generic((gccb), \
struct ccb_hdr * : (gccb), \
default: &((gccb)->ccb_h))
void
-xpt_action(union ccb *start_ccb)
+_xpt_action(struct ccb_hdr *ccb_h)
{
- CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE,
- ("xpt_action: func %#x %s\n", start_ccb->ccb_h.func_code,
- xpt_action_name(start_ccb->ccb_h.func_code)));
+ CAM_DEBUG(ccb_h->path, CAM_DEBUG_TRACE,
+ ("xpt_action: func %#x %s\n", ccb_h->func_code,
+ xpt_action_name(ccb_h->func_code)));
- start_ccb->ccb_h.status = CAM_REQ_INPROG;
- (*(start_ccb->ccb_h.path->bus->xport->ops->action))(start_ccb);
+ ccb_h->status = CAM_REQ_INPROG;
+ (*(ccb_h->path->bus->xport->ops->action))((union ccb *)ccb_h);
}
-void xpt_action(union ccb *new_ccb);
+/*
+ * xpt_action is now a macro that takes a pointer to any of the ccb types,
+ * converts it to the ccb_hdr and calls _xpt_action
+ */
+void _xpt_action(struct ccb_hdr *ccb);
+#define xpt_action(new_ccb) _xpt_action(cam_ccb_to_hdr((new_ccb)))
```
These changes allow the rest of the tree to compile annd to slowly migrate to calling `xpt_action` with any ccb type. And would eventually allow us to prohibit `union ccb *`
Status: <span style="color:green">Implemented and Tested</span>
### Change xpt_done and xpt_done_direct to take any ccb pointer
Similar changes to `xpt_done` and `xpt_done_direct` were done and the rest of the tree compiled w/o any changes.
Status: <span style="color:green">Implemented and Tested</span>
### Change xpt_xport_ops.action to take ccb_hdr *
Since `xpt_action` calls the xport action routine, we needed to cast the `ccb_h` we had in hand to a `unionc ccb *`. Conversion of these routines is straight forward. However, that conversion suggests that we can propigate either `ccb_h` to the downstream from `foo_action` that's used for each transport or change them to use the correct ccb type that's sent down for that function.
Status: <span style="color:green">Implemented and Tested</span>
### downcasting
This is an unsolved problem. It would be nice to have a macro that could cast the ccb_h to the proper type for the function / use. A macro would also let us have an invariant verison that ensured that the length was correct for the conversion, and that the function matched the target type. This should allow for safer operation. It's tedious and error prone today with the free casting that we use, and the direct use of `&ccb.foo`to do the casting can't be instrumented to ensure that the `foo` is right for the function.
Status: <span style="color:red">Greenfield design</span>
### Change cam_sim_alloc
This one is a little uglier... xxx this seciton will be redone based one xpt_action
```
typedef void (*sim_action_func_union)(struct cam_sim *sim, union ccb *ccb);
typedef void (*sim_action_func_hdr)(struct cam_sim *sim, struct ccb_hdr *hdr);
#define cam_sim_alloc(a, b, c, d, e, f, g, h, i) \
_Generic((a),
sim_action_function_union: cam_sim_alloc_union(a, b, c, d, e, f, g, h, i),
sim_alloc_function_hdr: cam_sim_alloc_hdr(a, b, c, d, e, f, g, h, i))
```
but then we have two function pointers to marshall and how do we store that, call the right one, etc and be performant about that at runtime?
jhb: You could probably use function pointer casts in sim_alloc_function_union and make the internal function pointers always use the new API. imp: You can't portably cast a pointer to a function taking a foo arg to a pointer to a function taking a bar arg for similar reasons to what we have here (not sure I have a standard's based reason for this, but I do have some bloody knuckles from the 90s when I did this and ran into trouble). What we can do, though, is to have a wrapper that does the right thing so no casting is needed.
Status: <span style="color:orange">Design needs to be documented</span>
### Change xpt_alloc_ccb
jhb: Ideally you'd want to stop allocating unions and instead allocate CCBs with a given size. Warner noted that always allocating the full union can result in wasted space for smaller CCBs. It is probably a good idea in general to have dedicated UMA zones for commonly-used CCB types like ccb_scsiio, ccb_ataio, ccb_nvmeio, and ccb_mmcio. Otherwise, it might be best to change xpt_alloc_ccb to take a size_t argument. This can be shimmable via the C preprocessor using the same type of shims present for DRIVER_MODULE in stable/13 to make the devclass argument optional.
imp: I don't think that's the ideal we want to shoot for. I don't think we generally want to allocate ccb's of different sizes. There's code that uses the full size of different types during its life because it can't reliably allocate a new CCB after the first transaction completes, but it needs to send a CCB to do the next step of the state machine. It's way more efficient to just allocate the big ones from the slabs. Having a different size per would mean that we lose that. This is the last thing I really want to do: those optimizations already exist today via a specialize interface and it was tricky enough to get that right. And CCBs tend to be reused in places for naughty things so allocating 'just enough' in those cases would be nuts. I'd prefer that xpt_alloc_ccb return the biggest possible CCB.
So I'd prefer that we not change this interface at all as part of this process. If we're going to allocate specialized things for different purposes, I'd prefer new functions to do that since the size is insufficient.
Plus, there's code that does the scheduling that allocates a generic ccb and passes it into the periph as a way to control the number of in-flight outstanding I/Os to what the SIM can provide at the moment. There's some code to cope with this to use the uma zones right now, so this might not be too terrible for those... but not all periphs have been converted over, so they don't all set the right sizes of things that are needed for this to happen.
tl;dr: I really don't want to change that interface.
However, there may be value in having specailized interfaces that return ccb's that are for the purpose at hand. This could eliminate many of the on-stack CCBs we have today.
Status: <span style="color:red">No agreed path forward</span>
### CCB completion routines
The current ccb completion routines take a `union ccb *` which is something we're trying to eliminate. The problem is a little harder than the `cam_sim` case where we could use `_Generic` tricks to know the type of the function and then dispatch it to the right compat layer when the old interface is used. With these, the `.cbfncp` member is assigned directly.
Thankfully, these are by and large confined to the pariph drivers under `sys/cam` and two SIM drivers for sc. There's about 420 places that cbfcnp is referenced, but many are '0' for a null pointer, specify that as a parameter to a function, etc. There's only a few of these (about 10), except for the scsi_da.c driver, which has a dozen all by itself. so while this is referenced in a lot of places, the actual code change will be relatively minor (though will need to touch a lot of files in one commit).
Status: <span style="color:orange">Design OK, rework needed</span>
### Changes to ioctls
jhb: Would want to ensure that ioctl's either take an explicit size and pointer to ccb_header, or if the ioctl handlers already fetch the size from the ccb header they can just change from the union to the struct. If ioctl structures embed the union rather than via a pointer, they would need to change to be a pointer instead going forward. That will avoid breaking the ABI in the future if larger CCB types are added.
imp: ioctl isn't likely going to change initially. And changes to how the ioctl is called would be a lot harder to shim because ioctl's are too polymoprhic to do the _Generic hacks that we're contemplating elsewhere. Thankfully a lot of the code goes through wrapper functions. Some research is needed here.
Status: <span style="color:red">No agreed path forward</span>
### Compat Shims and KBI in stable
To preserve the KBI in stable branches, the old function symbols can be defined by defining wrappers around the old API after #undef'ing the wrapper macros. Eg
```
void _xpt_action(struct ccb_hdr *ccb_h) {}
#undef xpt_action
void xpt_action(union ccb *new_ccb) {_xpt_action(&new_ccb.ccb_h);}
```
But this is as yet untested, and we'd likely need the #undef + compat symbol shim at the end of the file. Maybe it's better to get this into 14 with the symbols we want, but the above could be done to 13 if we wanted to merge this work there to allow the new API there to coexist with the old, at least in part to make MFCs easier (we'd likely want to merge both SIM and PERIPH changes, but could forgoe some of the cleanup needed only to get to the goal of union ccb * elminiation).
Status: <span style="color:red">Research needed</span>
### Man pages
I'd say we'd need to update the man pages for these changes... If we had any man pages for them, that would be true. What's needed is to have the man pages written as we touch public or semi-public interfaces. I started down this path a long time ago, but gave up in favor of a UML doc. That didn't work out so well just yet (need to circle back with the docs folks to allow the right env in the handbook for the plug-ins that are needed).
In the mean time, we write man pages as we change interfaces, or find glaring holes in the man pages.
## Github branch
Warner has been pushing a github branch `cam-no-union` to his repo at `https://github.com/bsdimp/freebsd`.