Demo of Tether's QVAC running local inference on a mobile device at incredible speed, via llama.cpp + Llama 3.2 1B. QVAC is a generalized inference and fine-tuning runtime able to adapt to any device, from smartphones to laptops and servers. Lots of models are supported already. More to come. No limits. Infinite intelligence. Coming soon.