Anything can be done by a core or external developer; it's just a matter of how much time people want to invest. It's hard to estimate how much time that is without knowing the developer.
For hair and particles, the least work would be to wait for the new particle and hair systems. The new data structures use plain arrays of floats and don't need any special conversion. A change that could be made sooner is a utility function that converts the old particles to the new hair and point cloud data structures. That would be useful in general, at a minimum for backwards compatibility.
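To illustrate the shape of such a conversion, here is a minimal sketch in plain Python. The data layout is hypothetical (a list of per-particle structs with hair key positions, nothing from the actual Blender code), but it shows the essential step: flattening nested per-particle data into one contiguous float array plus per-curve offsets, the kind of layout the new hair data uses.

```python
# Hypothetical sketch, not Blender's actual API: flatten old-style
# per-particle hair keys into plain float arrays.

def particles_to_flat_arrays(particles):
    """particles: list of dicts like {"hair_keys": [(x, y, z), ...]}."""
    positions = []       # flat [x0, y0, z0, x1, y1, z1, ...]
    curve_offsets = [0]  # start index of each curve, plus the total count
    for p in particles:
        for co in p["hair_keys"]:
            positions.extend(co)
        curve_offsets.append(curve_offsets[-1] + len(p["hair_keys"]))
    return positions, curve_offsets

old_particles = [
    {"hair_keys": [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]},
    {"hair_keys": [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 0.0, 2.0)]},
]
positions, offsets = particles_to_flat_arrays(old_particles)
```

A buffer like `positions` can be handed to NumPy or a C++ consumer without any per-element conversion, which is the point of the plain-array layout.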
For meshes, ideally the internal data structures would also move to such plain arrays, but that's very hard.
For instances, it's possible to add an API function that returns a flat array of all instances in some form, but this may need deeper changes, since some instance data gets invalidated after every iteration.
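The invalidation problem can be shown with a toy model (hypothetical names, nothing from Blender's code): an iterator that reuses one shared element per step, so the caller must copy the data out immediately. A flat-array API would simply perform that copy once, internally.

```python
# Toy model of an iterator that invalidates its element on every step.
class ReusedInstance:
    def __init__(self):
        self.matrix = None

def iter_instances(matrices):
    inst = ReusedInstance()
    for m in matrices:
        inst.matrix = m  # previous contents are overwritten (invalidated)
        yield inst

def instances_to_flat_matrices(matrices):
    """What a flat-array API would do: copy each instance's data out
    before the next iteration invalidates it."""
    flat = []
    for inst in iter_instances(matrices):
        flat.extend(inst.matrix)  # copy now, while the data is valid
    return flat
```

Holding on to the yielded `inst` objects and reading them after the loop would give wrong results, which is exactly why a copy-out API is attractive.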
A C++ API in general has a big maintenance cost, and so would also require a commitment to maintain it.
For image loading, Blender itself does not support doing this in parallel; there is a mutex lock around it, so even doing it in the background would not really help. This could be solved with some refactoring, though.
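A small stand-alone demonstration of why the mutex defeats background loading (the loader and lock here are stand-ins, not Blender's actual code): with a single global lock, even concurrent threads end up loading one image at a time.

```python
import threading
import time

_image_lock = threading.Lock()

def load_image(path):
    with _image_lock:    # global mutex: only one load can run at a time
        time.sleep(0.05)  # stand-in for the actual decoding work
        return path

def load_in_background(paths):
    """Load all paths on separate threads and measure the total time."""
    results = []
    threads = [
        threading.Thread(target=lambda p=p: results.append(load_image(p)))
        for p in paths
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

results, elapsed = load_in_background(["a.png", "b.png", "c.png", "d.png"])
# despite four threads, elapsed is at least 4 x 0.05s: the lock serializes them
```

Removing or narrowing that lock is the refactoring that would be needed before background loading pays off.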
I don't know if async loading APIs are the right solution. You could imagine this for all kinds of operations: image loading, volume loading, generating instances, and so on. Adding an async API for each potentially expensive operation seems not great to me. I think it would be better to use multithreading here, and that should be possible if the Python GIL is released while the image loading happens.
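A sketch of what that would look like from the script side, assuming the expensive call releases the GIL (here `time.sleep` stands in for a C image loader, since sleeping releases the GIL the way GIL-releasing C code would): plain threads overlap the work, and no per-operation async API is needed.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_image(path):
    time.sleep(0.05)  # stand-in for a loader that releases the GIL
    return path

paths = ["img_0.png", "img_1.png", "img_2.png", "img_3.png"]
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(load_image, paths))
elapsed = time.perf_counter() - start
# the four loads overlap, so elapsed is close to one load, not four
```

The same `ThreadPoolExecutor` pattern would cover volume loading or any other expensive call, which is why releasing the GIL is a more general fix than adding one async API per operation.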