Didn't someone just make a post about a game stream server that lets multiple gamers use the same machine? Not with VMs, but with multiple users and virtual displays.
You'd connect to it with any Moonlight client, and it creates an environment for you to use the machine for whatever.
Why use consumer graphics cards to run such a narrow, "AI"-specific workload? There are dedicated cards built to accelerate machine learning that deliver plenty of performance at far less power draw than 3090s.
Not if it's for inference only. What do you think the "AI accelerators" they're putting in phones now are? If they were anywhere near as expensive or power-hungry as an entire 3090 for the performance they deliver, would they be putting them in small devices?
Would you link one? Because the only things I know of are the small Coral accelerators, which aren't really comparable, and specialised data-centre hardware where you have to request a quote just to get a price, from companies that probably aren't much interested in selling direct to consumers.
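For context on why the Coral parts aren't comparable: they only run small, pre-quantised INT8 TFLite models compiled for the Edge TPU. A minimal sketch of what inference on one looks like, assuming the Edge TPU runtime (libedgetpu) is installed; the model filename here is a hypothetical placeholder:

```python
# Sketch of inference on a Coral Edge TPU via the TFLite runtime.
# Assumes libedgetpu is installed and the model was already compiled
# for the Edge TPU; "model_edgetpu.tflite" is a placeholder name.
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(
    model_path="model_edgetpu.tflite",
    experimental_delegates=[load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

# Feed a dummy uint8 input matching the model's expected shape.
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=np.uint8))
interpreter.invoke()

out = interpreter.get_output_details()[0]
print(interpreter.get_tensor(out["index"]))
```

That's the whole surface: small quantised vision-scale models, a few watts, nothing in the class of workload a 3090 handles.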