atomic operations

Hi

I need to know which operations of an HDD can be regarded as atomic (we are trying to design a transactional database). Someone told me that a sector is always either completely written or not written at all. Is that correct? Can it be guaranteed that this also holds during a power loss? Are HDDs capable of detecting such conditions, or could the disk write rubbish onto some sectors in such a case (maybe because the controller was sending data and, when the power went away, it started sending useless data)?

It's really hard to find out how it all works... Someone else told me that there is no guarantee of anything, and that it might even be possible for the head to still be writing while returning to its parking position, thus destroying data. But I can't imagine this... is it really so unsafe? (By the way, we are NOT talking about damaged HDDs.)

I hope you can give me some information about this...
Rudolf

My understanding for many years has been that the drive always has enough power to complete the final write and confirm it to the host.

Why not test it ;o)

Sorry for the very late response.

There are very, very few real guarantees. The 'writing rubbish' scenario you describe should not be possible with a well-designed drive, and would in any event be caught with extremely high probability by the error-correction codes in the individual sectors affected. One possibility that the disk cannot guard against, however, is that when power is failing the rest of the system may die before the disk has finished transferring and writing the data. The disk will then THINK it is receiving good data and write it, even though it is receiving garbage. This has actually happened in Unix environments, where existing inodes were lost because the single sector being updated got partially zeroed out.

If you're using a transaction log (or anything analogous), its writes must be protected by strong spanning internal checksums (since it's also possible to have both the beginning and the end of a multi-sector transfer complete while some part in the middle never did) and no active portion of the log can ever be over-written.
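To make the checksum idea concrete, here is a minimal sketch in Python. The record layout and the names encode_record and read_valid_records are my own assumptions for illustration, not any particular database's format, and CRC32 is used only because it ships with the standard library; a real log would want a stronger checksum. Each record carries a checksum that spans its length field and its whole payload, so a record whose beginning and end reached the disk but whose middle did not fails verification.

```python
import struct
import zlib

# Hypothetical record layout: 4-byte payload length, 4-byte CRC32, payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload with a checksum spanning the length and every payload byte."""
    crc = zlib.crc32(struct.pack("<I", len(payload)) + payload)
    return HEADER.pack(len(payload), crc) + payload


def read_valid_records(log: bytes) -> list:
    """Return the records up to (not including) the first torn or damaged one."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) != length:
            break  # the tail of this record never reached the disk
        if zlib.crc32(struct.pack("<I", length) + payload) != crc:
            break  # some part of the record is not what was written
        records.append(payload)
        offset = start + length
    return records
```

Recovery stops at the first bad record rather than skipping over it, because nothing after a damaged record can be trusted to be in order.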

Edit: More general concerns include protecting against 'lost' writes that appear to complete but actually did nothing and 'wild' writes that wrote to some location other than the intended one without returning an error status. In both cases the system continues merrily on its way having corrupted the database, and you need to provide for a) discovering this corruption the next time the affected data is accessed and b) fixing it.
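One common way to discover both kinds of corruption on the next read is to make every block self-identifying. The following Python sketch assumes a made-up block header (block number, version, CRC32) and assumes the expected version of each block is recorded somewhere else, for example in an index or parent block; seal_block and check_block are hypothetical names. It only covers the discovery half; fixing the block still needs a second copy or the log.

```python
import struct
import zlib

# Hypothetical block header: block number, version, CRC32 over both plus the data.
BLOCK_HEADER = struct.Struct("<QQI")


def seal_block(block_no: int, version: int, data: bytes) -> bytes:
    """Prefix the data with its own address, a version and a checksum."""
    crc = zlib.crc32(struct.pack("<QQ", block_no, version) + data)
    return BLOCK_HEADER.pack(block_no, version, crc) + data


def check_block(raw: bytes, expected_block_no: int, expected_version: int) -> str:
    """Classify a block that was just read from location expected_block_no."""
    if len(raw) < BLOCK_HEADER.size:
        return "corrupt"
    block_no, version, crc = BLOCK_HEADER.unpack_from(raw)
    data = raw[BLOCK_HEADER.size:]
    if zlib.crc32(struct.pack("<QQ", block_no, version) + data) != crc:
        return "corrupt"
    if block_no != expected_block_no:
        return "wild write"  # intact block, but it belongs somewhere else
    if version != expected_version:
        return "lost write"  # intact block, but not the version we wrote last
    return "ok"
```

The checksum alone catches neither case, since a lost or wild write leaves behind a perfectly valid block; it is the address and version comparison that exposes them.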

Writing really robust databases is very hard work and requires a great deal of knowledge. Meaning no disrespect, but it sounds as if you need a lot more experience before undertaking such a task.

Ahhh databases and atomics.....

In theory the transaction either "commits" or "rolls back".

But consider at what stage the data is considered "written". The issue is that the database can consider it "committed" while it is actually still held in a volatile RAM buffer waiting to be written. A "commit" IS NOT the same as a guarantee of being physically written to the disk surface. There is NO WAY a database can know what the attached storage is. On its way to the platter the data may be sitting at any of these stages:

1. the disk buffer in the computer
2. the ram buffer in the computer
3. the ram buffer in the drive
4. the data is on the disk surface
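As a rough illustration of pushing data through those stages, here is a small Python sketch; durable_append is a hypothetical helper name. It writes the bytes and then calls os.fsync, which asks the operating system to flush its own buffers for that file to the device. Note the hedge: whether the drive's internal RAM buffer is also flushed depends on the OS, the file system and the drive honouring the flush request, so this narrows the window rather than giving a guarantee.

```python
import os


def durable_append(path: str, data: bytes) -> None:
    """Append data to path and ask the OS to push it to the storage device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Flush the OS buffers for this file; the drive's own cache is
        # outside our control from here.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A database would only report the transaction as committed after this call returns, and even then it is trusting the layers below it.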

To understand this you need a very good grounding in database design. Have a look at the Oracle forums: you will see that even an Oracle database makes extensive use of disk buffers and RAM buffers; they even implemented their own file system that writes directly to the sectors of the disk drive without intervening OS calls.

When a database reads data there is NO flag telling it whether the data came from:
a disk buffer
a drive cache
a disk surface
a NAS system.

They get round this by using something called a "transaction log". This is a file that contains the current state of every modification to the database. If the system crashes during a write, this log will contain the information and the "commit" status. If you are really, really lucky, after bringing the system back up you can "roll forward" and the log will exactly match your data files, meaning the state of the database is intact.
If you are "unlucky", I'm afraid it is time for a programmer to go into the database and try to tidy up the mess. This is potentially a bigger problem if you have multi-site replication, where databases all over the world are being synchronized, since the other databases continue to process and get ahead of the "crashed" database. It is also very dependent on how well written the applications that use the database are: a good database cannot make up for a badly written application that holds multiple record "locks" over an extended time period.
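The "roll forward" step can be sketched roughly like this in Python. The tuple format of the log records and the name roll_forward are assumptions for illustration only; real logs are far more involved. The idea is simply that modifications are replayed in log order, but only for transactions whose "commit" record actually made it into the log.

```python
def roll_forward(log_records, data):
    """Replay logged writes onto data, applying only committed transactions.

    log_records is an iterable of ("write", txn_id, key, value) or
    ("commit", txn_id) tuples, in the order they were logged.
    """
    pending = {}  # txn_id -> list of (key, value) not yet committed
    for kind, txn_id, *rest in log_records:
        if kind == "write":
            key, value = rest
            pending.setdefault(txn_id, []).append((key, value))
        elif kind == "commit":
            for key, value in pending.pop(txn_id, []):
                data[key] = value
    # Whatever is still in pending belongs to transactions that never
    # committed before the crash, so it is deliberately dropped.
    return data
```

For example, a log holding a committed transaction 1 and an unfinished transaction 2 leaves only transaction 1's changes in the data. Replaying the same log a second time gives the same result, which matters because the recovery itself may be interrupted by another crash.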

So you need to be able to clearly distinguish in your mind between a "commit" and a physical sector write. Again..... they are NOT the same thing.
