I think what matters in this kind of measure is whether a project's use of unsafe actually has undefined behavior. The number of unsafe blocks is not really my concern so much as what those blocks are doing. If you build a single faulty abstraction via unsafe, anything that uses it is broken.
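To make the "single faulty abstraction" point concrete, here is a minimal sketch. Both function names are hypothetical, made up for illustration. Each version contains exactly one unsafe block, so a block count cannot tell them apart, yet only the second is sound.

```rust
/// Hypothetical helper: a *safe* signature over an unchecked read.
/// Unsound: nothing stops a caller from passing an empty slice, and then
/// every safe caller silently inherits the undefined behavior.
pub fn first_unsound(xs: &[u32]) -> u32 {
    unsafe { *xs.get_unchecked(0) }
}

/// Sound variant: the emptiness check establishes the precondition
/// that the unsafe block relies on.
pub fn first_sound(xs: &[u32]) -> Option<u32> {
    if xs.is_empty() {
        None
    } else {
        // SAFETY: xs is non-empty, so index 0 is in bounds.
        Some(unsafe { *xs.get_unchecked(0) })
    }
}

fn main() {
    let v = vec![7, 8, 9];
    println!("{:?}", first_sound(&v));
    println!("{:?}", first_sound(&[]));
    // Fine for a non-empty slice; first_unsound(&[]) would be UB.
    println!("{}", first_unsound(&v));
}
```

The unsound version looks perfectly healthy in any test that only passes non-empty slices, which is the problem with counting blocks or counting covered lines.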
In my projects, it usually comes down to a scenario like needing to write inline assembly or invoke a foreign function, where there are close to zero guarantees the language can give me.
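A rough sketch of the foreign-function case, assuming a platform whose C runtime provides `strlen` and a toolchain recent enough to accept `unsafe extern` blocks (Rust 1.82 or later, as far as I know). The wrapper name `c_strlen` is my own. The compiler cannot check anything about the C side, so the safety argument lives entirely in the wrapper's types and comments.

```rust
use std::ffi::{c_char, CStr};

// Declaration of a C function. The compiler has to trust this signature;
// it cannot verify it against the actual C library.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: taking `&CStr` guarantees a valid, NUL-terminated pointer,
/// which is exactly what `strlen` requires.
pub fn c_strlen(s: &CStr) -> usize {
    // SAFETY: `s.as_ptr()` points to a NUL-terminated buffer that stays
    // alive for the duration of the call.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"hello\0").unwrap();
    println!("{}", c_strlen(s));
}
```

There is no way to write this without unsafe, so the useful question is whether the wrapper's precondition argument holds, not whether the block exists.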
I worry a little that perfectly cromulent Rust will get a bad name when the culture tends towards “unsafe is bad.”
Is there real value in these statistics vs. an approach where the measure is test coverage of unsafe blocks?
I agree that unsafe isn’t evil and shouldn’t be “avoided at all costs”, especially when the unsafe usage could, e.g., be eliminated by the compiler (very common usage, actually!) or gives you far superior codegen or lower code complexity.
But test coverage of unsafe blocks is not a meaningful metric. The best automated solution is standalone Miri runners exercising all branches of the code (via tests or otherwise), because tests on their own won’t catch things like out-of-bounds reads or heap corruption unless you get lucky.
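Here is a small sketch of why coverage alone says little. The function `sum_first` is hypothetical. The code below only makes the in-bounds call, so it is well-defined; the comments describe what would happen with an out-of-bounds argument.

```rust
/// Hypothetical example: sums the first `n` bytes of `xs` via raw pointer reads.
///
/// # Safety
/// `n` must be less than or equal to `xs.len()`.
pub unsafe fn sum_first(xs: &[u8], n: usize) -> u32 {
    let p = xs.as_ptr();
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: the caller guarantees i < n <= xs.len().
        total += unsafe { *p.add(i) } as u32;
    }
    total
}

fn main() {
    let data = [1u8, 2, 3, 4];
    // In-bounds call: defined behavior, and it already covers every line
    // and branch of `sum_first`.
    let total = unsafe { sum_first(&data, 4) };
    println!("{total}");
    // Calling `sum_first(&data, 5)` would read one byte past the array.
    // A normal run may well not crash, because the neighbouring memory is
    // usually readable, so a test that only checks "it ran" can pass.
    // Miri tracks allocation bounds and should report the read as UB.
}
```

With Miri installed on a nightly toolchain, the usual way to run a test suite under it is `cargo +nightly miri test`. Note that the in-bounds call above gives 100% coverage of `sum_first` without ever exercising the input that breaks it.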
I agree about test coverage. I’d say it’s less bad but still doesn’t necessarily mean anything rigorous.
Short of formal verification, which I think is often going to be unreasonable, we generally have a spectrum of “less bad” options.