Patrick and I went for an ice cream, and two hours later ... a potential proposal:
User-visible end state: make all data types (whether based on numpy or arrow) use NA as the (visible) missing value sentinel, with NA/nullable semantics (i.e. propagation of NA in comparisons and Kleene logic in boolean operations)
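To make "NA/nullable semantics" concrete, here is a small sketch of how the existing masked dtypes already behave, as far as I know. It is only an illustration of the semantics this proposal would apply everywhere; the variable names are mine and nothing here is new API.

```python
import pandas as pd

# Nullable boolean array with one missing value.
a = pd.array([True, False, pd.NA], dtype="boolean")

# Kleene logic: the result is only NA when the known operand
# does not already decide the outcome.
or_true = a | True      # True | NA is True, so no NA remains
and_false = a & False   # False & NA is False, so no NA remains
and_true = a & True     # NA & True is undecided, so NA stays NA

# Comparisons propagate NA instead of returning False.
ints = pd.array([1, 2, pd.NA], dtype="Int64")
eq = ints == 1          # boolean dtype, NA in the last position

print(or_true)
print(and_false)
print(and_true)
print(eq)
```

The point is that `NA == 1` is "unknown" rather than `False`, which is different from how NaN behaves in the numpy-based dtypes today.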
Implementation steps:
Make ExtensionArrays fully support 2D (and make everything use ExtensionArray/Dtype, also the numpy based ones)
Change the pd.NA scalar to be less annoying (probably mostly bool(pd.NA) not raising?)
Fix conversion to numpy to not use object dtype
Only use NA for the masked (numpy-based) Floating dtype (so don't allow NaN to be present, and thus no need to distinguish the two; NaN could still be present in the underlying values, but would be hidden by the mask). This makes conversion numpy <-> pandas clearer (numpy only has NaN, pandas only has NA, so the conversion on input/output is unambiguous)
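A rough sketch of what the steps above are reacting to, using only calls that exist today (behaviour may differ between pandas versions, so treat the comments as my understanding rather than a spec). It shows `bool(pd.NA)` raising, and the explicit numpy <-> pandas round trip where NA becomes NaN on output and NaN becomes NA on input.

```python
import numpy as np
import pandas as pd

# bool(pd.NA) currently raises, which is the "annoying" part.
try:
    bool(pd.NA)
    na_bool_raises = False
except TypeError:
    na_bool_raises = True

floats = pd.array([1.5, pd.NA], dtype="Float64")

# Default conversion: inspect the dtype rather than assume it.
print(floats.to_numpy().dtype)

# Output direction: explicitly map NA -> NaN to get a float64 array.
as_float = floats.to_numpy(dtype="float64", na_value=np.nan)

# Input direction: NaN coming from numpy is treated as missing (NA).
back = pd.array(as_float, dtype="Float64")

print(na_bool_raises)
print(as_float)
print(back)
```

With "numpy only has NaN, pandas only has NA", the two explicit conversions above would simply be the default, and no object dtype would be needed in between.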