In short: For most people, data loss from a backup error means losing years of personal photos. For Japan's Kyoto University, it meant losing 77 TB of critical research data. A faulty software update to the backup system of the university's supercomputer accidentally erased 34 million files over a two-day period.
The culprit behind this massive data loss was a flawed script, originally designed to remove old, unnecessary log files from Kyoto University’s Cray/HPE supercomputer, that was rolled out as part of a software update. Instead, between December 14 and 16, 2021, it deleted 77 TB of research data from the computer’s large-capacity /LARGE0 backup disk.
The university originally estimated that up to 100 TB of data had been lost after the erroneous update erased almost all files older than 10 days. The 77 TB of research data that was actually deleted comprised 34 million files belonging to 14 research groups. Although Kyoto University did not disclose the nature or details of the deleted research data, it noted (in Japanese) that the files belonging to 4 groups cannot be recovered.
The university’s supercomputer supplier, Hewlett Packard Japan (HPE), admitted 100 percent responsibility for the incident and issued a letter of apology that the university later published. HPE said its update released a modified script “to improve visibility and readability,” according to The Stack.
However, HPE said it had been unaware of a side effect of this change: the modified shell script was reloaded in mid-execution, which resulted in “undefined variables” and the deletion of files on the supercomputer’s /LARGE0 backup disk.
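HPE's actual script has not been published, so the following is only a minimal sketch of the kind of guard that can stop an undefined or empty directory variable from widening a cleanup job. It assumes a hypothetical layout in which log files live in a subdirectory of an allowed root; the function names, the `*.log` pattern, and the 10-day default are illustrative and not taken from the incident report.

```python
import time
from pathlib import Path


def find_stale_logs(log_dir, allowed_root, max_age_days=10, now=None):
    """Return *.log files under log_dir that are older than max_age_days.

    Raises ValueError when log_dir is empty/undefined or is not a strict
    subdirectory of allowed_root, instead of silently scanning a wider tree.
    """
    # Guard 1: an unset variable must never be treated as "scan everything".
    if not log_dir:
        raise ValueError("log_dir is empty or undefined; refusing to scan")

    root = Path(allowed_root).resolve()
    target = Path(log_dir).resolve()

    # Guard 2: the target must sit strictly below the allowed root.
    if root not in target.parents:
        raise ValueError(f"{target} is not a subdirectory of {root}")

    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400
    return sorted(
        p for p in target.rglob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def delete_stale_logs(log_dir, allowed_root, max_age_days=10, dry_run=True):
    """Delete stale logs; with dry_run=True (the default) only report them."""
    stale = find_stale_logs(log_dir, allowed_root, max_age_days)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Defaulting to a dry run and failing loudly on an empty path are design choices for this sketch: they make the destructive path something the caller has to opt into explicitly.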
Kyoto University has since paused the backup process while it works on improvements and preventive measures against such incidents. In addition to mirrored backups, the university also plans to support incremental backups once the backup program resumes later this month.
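The university has not described how its incremental backups will work. As a generic illustration of the idea: a mirror typically tracks the current state of the source, so deletions on the source tend to propagate to the copy, whereas an incremental run copies only what changed since the previous run into a separate snapshot and leaves earlier snapshots untouched. The sketch below assumes change detection by file modification time; `incremental_backup` and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def incremental_backup(source, snapshot_dir, last_backup_time):
    """Copy files modified after last_backup_time into snapshot_dir.

    Returns the copied paths, relative to source, as POSIX-style strings.
    snapshot_dir is assumed to be located outside the source tree.
    """
    source = Path(source)
    snapshot_dir = Path(snapshot_dir)
    copied = []
    for path in sorted(source.rglob("*")):
        # Only regular files that changed since the previous backup run.
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            relative = path.relative_to(source)
            dest = snapshot_dir / relative
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dest)  # copy2 also preserves timestamps
            copied.append(relative.as_posix())
    return copied
```

Each run would write into a new snapshot directory (for example, one named after the run date), so an accidental deletion on the source does not remove data from snapshots taken earlier.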