Thanks for sharing. However, this falls short of being a good writeup because it lacks numbers and data.
I'll give a specific example in my feedback. You said:
```
so far, so good, I was able to play with PyTorch and run Qwen3.6 on llama.cpp with a large context window
```
But there are no numbers, results, or output pastes: no performance figures, no timings.
Anyone with enough RAM can run these models; it will just be impractically slow. The whole point of the Strix Halo is decent performance, so sharing numbers would be valuable here.
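For reference, llama.cpp ships a `llama-bench` tool that prints prompt-processing and generation speeds directly. Or, since you were already playing with PyTorch in Python, a minimal timing sketch could look something like the following (this assumes the llama-cpp-python bindings; the model path, context size, and prompt are placeholders):

```python
# A minimal timing sketch, assuming the llama-cpp-python bindings.
# The model path, context size, and prompt are placeholders.
import time

from llama_cpp import Llama

llm = Llama(
    model_path="./qwen.gguf",  # placeholder: path to your GGUF file
    n_ctx=8192,                # placeholder: the large context window being claimed
    n_gpu_layers=-1,           # offload all layers to the iGPU
    verbose=False,
)

start = time.perf_counter()
out = llm("Summarize the benefits of a large context window.", max_tokens=128)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f}s -> {n_tokens / elapsed:.1f} tok/s")
```

Even a couple of tokens-per-second figures like that, at different context lengths, would make the writeup far more useful.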
Do you mind sharing these? Thanks!