Randomly found it
Offline semantic text-to-image and image-to-image search on Android, powered by a quantized, pretrained state-of-the-art CLIP vision-language model and the ONNX Runtime inference engine - slavabarkov/tidy
The app is in this curated list of great Android FOSS apps.