This is why I laugh at so-called "AI researchers". They build "quality software" like this, while everyone else stops fucking around, uses ggml and llama.cpp, and doesn't have these weird issues.
While this is a bit too harsh - and the solution is naive at best - the problem is real.
The idea of bitwise reproducibility for floating-point computations is treated as laughable in just about every part of the DL landscape. Meanwhile, in nearly every other field that relies on fp computation, it's been the de facto standard for decades.
From NVidia not guaranteeing bitwise reproducibility even on the same GPU: https://docs.nvidia.com/deeplearning/cudnn/backend/v9.17.0/d...
To frameworks being even worse, drifting further away from reproducibility every day. The best you can hope for is to rank the frameworks by how bad they are, with TensorFlow far down at the bottom and JAX (currently) at the top.
This is a huge issue for anyone serious about developing novel models.
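To make the root cause concrete: IEEE-754 addition isn't associative, so any parallel reduction whose accumulation order isn't fixed (atomics, split reductions, multi-GPU all-reduces) can legitimately return bitwise-different results for the same inputs. A minimal, framework-free Python sketch of the ordering effect (values and sizes are just illustrative):

  import random

  # Same numbers, two different summation orders. Because fp addition
  # is not associative, the rounding error accumulates differently and
  # the results frequently differ in the last bits.
  random.seed(0)
  vals = [random.uniform(-1e6, 1e6) for _ in range(100_000)]

  a = sum(vals)             # left-to-right order
  b = sum(reversed(vals))   # reversed order
  print(a == b)             # often False
  print(abs(a - b))         # small but nonzero difference

A GPU kernel that lets thread scheduling decide the reduction order is doing the same thing, just nondeterministically from run to run.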
Not until it gets tensor parallelism.
Eh, those “ai researchers” are too busy rolling around in mounds of freshly minted Benjamins to care about “quality software”