ESET is blocking this site saying:
Threat found
This web page may contain dangerous content that can provide remote access to an infected device, leak sensitive data from the device or harm the targeted device.
Threat: JS/Agent.RDW trojan
Related: Gemma 4 on iPhone (254 comments) - https://news.ycombinator.com/item?id=47652561
Another related submission from 22 days ago: iPhone 17 Pro Demonstrated Running a 400B LLM (+700 pts, +300 cmts): https://news.ycombinator.com/item?id=47490070
There are many apps to run local LLMs on both iOS & Android
Unfortunately, Apple appears to be blocking the use of these LLMs within apps on its App Store. I've been trying to ship an app that contains local LLMs and have hit a brick wall with guideline 2.5.2
It runs on Android too, with AICore or even with llama.cpp
> edge AI deployment
Isn't the "edge" meant to be computing near the user, but not on their devices?
No it does not. This is about as “edge” as AI gets.
In a general sense, edge just means moving the computation to the user, rather than in a central cloud (although the two aren’t mutually exclusive, eg Cloudflare Workers)
It depends, because edge is a meaningless term and people choose what they want it to mean. In 2022, we set up a call with a vendor for 'edge' AI. Their edge meant something like 5 kW, and our edge was a single Raspberry Pi in the best case.
Your device is the ultimate edge. The next frontier would be running models on your wetware.
Not just running it on your wetware, but charging you for it.
Can't wait until AI companies go from mimicking human thought to figuring out how to license those thoughts. ;)
Man, can't wait for AI in my brain. And then intelligence will be pay-to-win.
For those who would like an example of its output, I'm currently working on a small, free (CC0, public domain) encyclopedia (just a couple of thousand entries) of core concepts in Biology and Health Sciences, Physical Sciences, and Technology. Each entry is being written entirely by Gemma 4:e4b (the 10 GB model). I believe this may be slightly larger than the model that runs locally on phones, so perhaps it is slightly better, but the output is similar. Here is an example entry:
https://pastebin.com/ZfSKmfWp
Seems pretty good to me!
Is the output coherent, though? I have yet to see a local model running on consumer-grade hardware that is actually useful.
It's highly coherent (see my other comment for an example of its text output) and yes, it's useful. I am starting to use Gemma 4:e4b as my daily driver for simple commands it definitely knows, things that are too simple to use ChatGPT for. It is also able to work through moderately difficult coding tasks. If you want to see it in action, I posted a video about it here[1] (the 10 GB one is at the 2-minute mark and the 20 GB one says hello at 5 minutes 45 seconds into the video). You can see its speed and output on simple consumer-grade hardware, in this case a Mac Mini M4 with 24 GB of RAM.
[1] https://youtube.com/live/G5OVcKO70ns
I run Qwen3.5 122B on a Framework Desktop at 35 t/s as a daily driver for security work, OS/systems work, and software engineering.
Never paid an LLM provider and I have no reason to ever start.
What spec of Framework Desktop do you run this on?
There is only one, and for this model you need the configuration with 128 GiB of RAM.
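As a back-of-envelope check on why 128 GiB is the relevant configuration: a dense model's resident weight memory is roughly its parameter count times the bits per weight. A rough sketch (weight memory only; KV cache and runtime overhead are not included):

```python
# Rough weight-memory estimate for a dense LLM:
# bytes ≈ parameters × bits-per-weight / 8
# (ignores KV cache, activations, and runtime overhead).
def weight_gb(params_billion: float, bits: int) -> float:
    return params_billion * 1e9 * bits / 8 / 1e9  # decimal GB

for bits in (16, 8, 4):
    print(f"122B at {bits}-bit ≈ {weight_gb(122, bits):.0f} GB")
# → 244 GB at 16-bit, 122 GB at 8-bit, 61 GB at 4-bit,
#   so a quantized 122B model can fit on a 128 GiB machine.
```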
Qwen3.5-9B and Qwen3.5-27B are pretty coherent on my 24 GB Android phone
Which Android phone has 24 GB?
I can try it for you
It can write (some) code that works. Just roughly guessing from my use, but I think of it as being a bit like ChatGPT circa-2024 in terms of capability & speed.
Disappointing if you compare it to anything else from 2026, but fairly impressive for something that can run locally at an OK speed.
Can we please ban content that is CLEARLY written by AI?
I find it fascinating that after all this time reporters still don’t even bother to proofread content for obvious AI tells. I guess nobody really cares anymore?
That bugged me too, so I started looking at other articles: they all look AI-generated to me. The whole website should be banned.
is there a comparison of it running on iPhone vs. Android phones?
You can run Android on just about anything, so it boils down to Linux GPU benchmarks.
That doesn't answer the question, and I'm curious too. I think there's a speed and battery advantage for the A19 Pro chip over the Snapdragon 8 Elite Gen 5, but to know for sure one would have to run the same model, in the most efficient way available, on both flagship iOS and Android devices.
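A fair cross-device comparison mostly comes down to measuring tokens per second for the same model and quantization on each phone. A minimal throughput-measurement sketch (the generator below is a stand-in I made up; a real test would call the actual on-device runtime):

```python
import time

def tokens_per_second(generate, prompt: str, n_tokens: int) -> float:
    """Time one generation call and return throughput in tokens/sec.
    `generate` is any callable that emits `n_tokens` tokens."""
    start = time.perf_counter()
    generate(prompt, n_tokens)
    return n_tokens / (time.perf_counter() - start)

# Stub standing in for a real on-device model (hypothetical):
def fake_generate(prompt: str, n: int) -> None:
    for _ in range(n):
        time.sleep(0.001)  # pretend each token takes ~1 ms

tps = tokens_per_second(fake_generate, "Hello", 100)
print(f"{tps:.0f} tokens/sec")  # a bit under 1000 for this stub
```

The same harness run against each phone's runtime (with identical prompt, context length, and quantization) would give a like-for-like number; battery impact would need a separate measurement.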