> 1. Both systems run with the same fixed input files (data/accounts.dat, data/txns.dat).
> 2. Each writes its results to out/accounts_out_*.dat.
> 3. Python scripts convert fixed-width output to CSV and compute SHA-256 checksums.
> 4. If the hashes match — behavior is proven identical.
Step 3 above introduces the possibility that the Python scripts alter the output, so that the hashes match even though the raw outputs did not match before the conversion.
I'm curious why step 3 is not "If the two outputs match — behavior is proven identical."
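To make the concern concrete, here is a minimal Python sketch. Everything in it is hypothetical: the field widths, the sample records, and the `to_csv_row` helper are my own assumptions and are not taken from the repo's scripts. It only shows how a conversion that strips padding can make two different fixed-width outputs hash identically, while hashing the raw bytes catches the difference.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def to_csv_row(record: bytes, widths: list[int]) -> str:
    """Hypothetical fixed-width to CSV conversion that strips field padding."""
    text = record.decode("ascii")
    fields = []
    pos = 0
    for width in widths:
        fields.append(text[pos:pos + width].strip())
        pos += width
    return ",".join(fields)


def raw_outputs_match(a: bytes, b: bytes) -> bool:
    """Compare hashes of the untouched output bytes."""
    return sha256_hex(a) == sha256_hex(b)


def converted_outputs_match(a: bytes, b: bytes, widths: list[int]) -> bool:
    """Compare hashes after the lossy CSV conversion."""
    csv_a = to_csv_row(a, widths).encode("ascii")
    csv_b = to_csv_row(b, widths).encode("ascii")
    return sha256_hex(csv_a) == sha256_hex(csv_b)


# Assumed layout: 6-character account id, 10-character amount.
WIDTHS = [6, 10]
legacy = b"000042    100.50"  # amount right-aligned in its field
ported = b"000042100.50    "  # amount left-aligned in its field

if __name__ == "__main__":
    print("converted match:", converted_outputs_match(legacy, ported, WIDTHS))
    print("raw match:", raw_outputs_match(legacy, ported))
```

For files on disk, the standard library's `filecmp.cmp(path_a, path_b, shallow=False)` does the direct byte-for-byte comparison without any intermediate format.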
From the article:
> This enduring reliance exists not out of nostalgia, but necessity: COBOL’s reliability, stability, and the prohibitive cost and risk of replacing decades of deeply integrated logic make it one of the most mission-critical technologies ever built.
That sentence struck me as odd. Is COBOL any more "reliable" or "stable" than any other language? I'm no COBOL expert, but when I've looked at it and read about how it works, it seems rather verbose and mundane. That's not unexpected; it was developed in a different era with different sensibilities.
Historically, COBOL lacked dynamic memory allocation: all data structures were fixed size and allocated at program startup. Although COBOL now has the equivalent of malloc/free, its long-time absence encouraged a coding style of using it sparingly, which does make a whole class of bugs less common in COBOL programs.
Yes, there was no dynamic memory allocation, but there are still many ways to ABEND your COBOL program. The reliability comes from the fact that these systems have been running for 40+ years, so the places where they could have ABENDed have probably been fixed [hopefully].
Having COBOL sources which match what's running in production is a load-bearing assumption :).
The problem I have with all COBOL translation models is that they completely ignore the actual modernization of the system. You've traded one type of syntactic sugar for another.
You mean the COBOL 2002+ revisions?
I think they mean that "COBOL" is often used as a synonym for old mainframe-based software. The language usually isn't the biggest issue with such systems. Any programmer can learn COBOL; just translating one syntax to another doesn't buy you much. It's also about the hardware the stuff runs on, the database systems, the job schedulers, etc.
I’ve been experimenting with formal, verifiable modernization: taking a small COBOL batch program and translating it through an intermediate representation and an Alloy formal model into Kotlin, while proving equivalence with the legacy output.
Repo: https://github.com/marcoeg/cobol-modernization-playbook
Would love feedback from people who’ve worked on reverse engineering or legacy transformations at scale.
> formal, verifiable modernization
Would it be possible to do the same to modernize a Kotlin program becoming legacy in the future to something even more modern?
How are you creating the IR?
Isn't the first code sample pasted in there twice?
Yes, starting at:
Then the code repeats.
Source code: https://github.com/marcoeg/cobol-modernization-playbook