I have recently been playing with llamafiles, particularly LLaVA, which, as far as I know, is the first multimodal open-source LLM (others might exist; this is just the first one I have seen). I was having it look at pictures of houses I am thinking of buying and asking it whether it sees anything wrong with each house.
The only problem I ran into is that the Windows 10 cmd prompt doesn't like the sed command, and I don't know of an alternative.