
It's true and untrue depending on how you look at it. Flash memory only supports changing/"writing" bits in one direction, generally from 1 to 0. Erase, as a separate operation, clears entire sectors back to 1, but is more costly than a write. (Erase block size depends on the technology but we're talking MB on modern flash AFAIK, stuff from 2010 already had 128kB.)
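A toy model of that asymmetry, as a rough sketch only: `ERASE_BLOCK_SIZE`, `program` and `erase` are made-up names, and the 16-byte block is an assumption chosen to keep the example small, not a property of any real chip. Programming is modelled as a bitwise AND, so it can only clear bits; getting a 1 back requires erasing the whole block.

```python
ERASE_BLOCK_SIZE = 16  # bytes; hypothetical, real erase blocks are far larger


def erase(flash: bytearray, block: int) -> None:
    """Reset every bit of one erase block back to 1."""
    start = block * ERASE_BLOCK_SIZE
    flash[start:start + ERASE_BLOCK_SIZE] = b"\xff" * ERASE_BLOCK_SIZE


def program(flash: bytearray, addr: int, value: int) -> None:
    """Program one byte: bits can go 1 -> 0 but never 0 -> 1."""
    flash[addr] &= value


flash = bytearray(b"\xff" * (2 * ERASE_BLOCK_SIZE))  # two freshly erased blocks
program(flash, 0, 0b10100101)
program(flash, 0, 0b11110000)  # tries to set some bits back to 1, which cannot happen
first_result = flash[0]        # only bits that are 1 in both values survive
erase(flash, 0)                # the only way to get 1s back, and it hits the whole block
after_erase = flash[0]
```

The second `program` call does not store `0b11110000`; the cell ends up holding the AND of both values.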

So, the drives do indeed never "overwrite" data - they mark the block as unused (either when the OS uses TRIM, or when it writes new data [for which it picks an empty block elsewhere]), and put it in a queue to be erased whenever there's time (and energy and heat budget) to do so.
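The bookkeeping described above could look roughly like this. It is a deliberately naive sketch: `ToyFTL` and all of its fields are hypothetical, one logical block maps to exactly one physical block, and real controllers are far more involved (and largely undocumented).

```python
from collections import deque


class ToyFTL:
    """Hypothetical translation layer: logical blocks are remapped, never overwritten."""

    def __init__(self, physical_blocks: int):
        self.free = deque(range(physical_blocks))  # erased, ready to be written
        self.erase_queue = deque()                 # stale, waiting for an erase
        self.mapping = {}                          # logical block -> physical block
        self.data = {}                             # physical block -> contents

    def write(self, logical: int, payload: bytes) -> None:
        if not self.free:
            self.collect()  # must erase before writing: the slow path
        if not self.free:
            raise RuntimeError("no erased blocks and nothing to reclaim")
        new = self.free.popleft()
        self.data[new] = payload
        old = self.mapping.get(logical)
        if old is not None:
            self.erase_queue.append(old)  # stale, but still physically present
        self.mapping[logical] = new

    def trim(self, logical: int) -> None:
        """The OS says this logical block is no longer needed."""
        old = self.mapping.pop(logical, None)
        if old is not None:
            self.erase_queue.append(old)

    def collect(self) -> None:
        """Erase queued blocks; a real drive does this in the background."""
        while self.erase_queue:
            block = self.erase_queue.popleft()
            self.data.pop(block, None)
            self.free.append(block)


ftl = ToyFTL(physical_blocks=3)
ftl.write(0, b"secret")
ftl.write(0, b"public")           # logical "overwrite" lands in a different physical block
stale_copy = ftl.data[0]          # the old contents are still sitting in physical block 0
current = ftl.data[ftl.mapping[0]]
ftl.collect()                     # only now is the old copy actually gone
```

Until `collect()` runs, the old payload is still in `ftl.data`, which is the toy version of stale data lingering on the chips after a logical overwrite.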

Understanding this is also quite important because it can have performance implications, particularly on consumer/low-end devices. Those don't have a whole lot of spare space to work with, so if the entire device is "in use", write performance can take a serious hit when it becomes limited by erase speed.

[Add.: reference for block sizes: https://www.micron.com/support/~/media/74C3F8B1250D4935898DB... - note the PDF creation date on that is 2002(!) and it compares 16kB against 128kB size.]



> Understanding this is also quite important because it can have performance implications

Security implications too. The storage device cannot be trusted to securely delete data.


If you write the drive's whole capacity in random data, you should be fine.


No. Say a particular model of SSD has over-provisioning of 10%, then even after writing the "whole" capacity of the drive, you can still be left with up to 10% of data recoverable from the Flash chips.


Right, so one had better write 2x or 10x the drive's capacity in random data to it.


You should be running flash with self-encryption (and make sure you have a drive that implements that correctly).

To zap a drive you ask it to securely drop the self-encryption key. The data will still be there, but without the key it is indistinguishable from random noise.


Well, who has the time and energy to verify that? Just overwrite it several times, or destroy the drive.


For some family photos? Probably. For sensitive material or crypto keys? Absolutely not, due to over-provisioning as mentioned (which can be way higher than 10% for enterprise drives), but also due to controllers potentially lying to you, especially when drives have things like pSLC caches, etc.


By any reasonable definition they do overwrite data. It's just that they can't overwrite less than a block of data.


If a logical overwrite only involved bits going from 1 to 0, are any drives smart enough to recognize this and do it as an actual overwrite instead of a copy and erase?


On embedded devices, yes, this is actually used in file systems like JFFS2. But in these cases the flash chip is just dumb storage and the translation layer is implemented on the main CPU in software. So there's no "drive" really.
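The check itself is simple to state: an in-place update is only possible if the new data never needs a 1 where the old data already has a 0. A minimal sketch, with `can_program_in_place` being a made-up helper name rather than anything from JFFS2, and ignoring page granularity, ECC and any limits on reprogramming the same page:

```python
def can_program_in_place(old: bytes, new: bytes) -> bool:
    """True if `new` is reachable from `old` by only clearing bits (1 -> 0)."""
    if len(old) != len(new):
        return False
    # Every 1 bit in the new value must already be 1 in the old value.
    return all((o & n) == n for o, n in zip(old, new))


only_clears_bits = can_program_in_place(b"\xff\xf0", b"\x0f\x00")
needs_erase = not can_program_in_place(b"\x0f", b"\x10")
```

If the check fails, the only option is the usual one: write the new data somewhere already erased and reclaim the old location later.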

On NVMe/PC type applications with a controller driving the flash chips… I have absolutely no idea. I'm curious too, if anyone knows :)


I do know. Apparently you downvoted my sibling response to you as too simplistic, but I was clearly responding to someone where the embedded bare drive situation is irrelevant.

When it comes to what non-bare flash drives do, you can start here: http://www.vldb.org/pvldb/vol13/p519-kakaraparthy.pdf

This paper is imperfect and the following citations are worth skimming. There's a cohort of similar papers chasing the same basic question in recent years that aren't densely cited amongst each other.

Go here next: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46... but note that's just a jumping off point to the more recent papers.

It's hard to gain a full understanding of this layer because it's the basis of intense competition, hence held closely by controller manufacturers.

I'm far from a world expert on this, but I have read a lot about it and can answer with what I know to the best of my ability.


> Apparently you downvoted my sibling response to you as too simplistic,

I didn't downvote your sibling response, but I did ignore it since it provided neither any sources nor any context for why I should trust your knowledge. Apparently others were less kind on your short statement.

With the additional information in this post, I'm much more willing to accept it into my head — thanks for answering this!


Yeah sorry that was unnecessarily grouchy of me.


Generally no, because the unit of write is a page.



