http://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_method

According to the documentation linked above, the O_DIRECT option is described as follows:

O_DIRECT: InnoDB uses O_DIRECT (or directio() on Solaris) to open the data files, and uses fsync() to flush both the data and log files.

As I understand it, O_DIRECT means that little or no data is cached in the OS page cache, whereas fsync() is used to flush data from the page cache to the device. So my question is: why does MySQL still use fsync() to flush the data files when the option is O_DIRECT?

Actually, the explanation is in the documentation you linked, in the paragraph that follows the description of the O_DIRECT option:

O_DIRECT_NO_FSYNC: InnoDB uses O_DIRECT during flushing I/O, but skips the fsync() system call afterward. This setting is suitable for some types of file systems but not others. For example, it is not suitable for XFS. If you are not sure whether the file system you use requires an fsync(), for example to preserve all file metadata, use O_DIRECT instead. This option was introduced in MySQL 5.6.7 (Bug #11754304, Bug #45892).

MySQL bug #45892 contains additional information:

Some testing by Domas has shown that some filesystems (XFS) do not sync metadata without the fsync. If the metadata would change, then you need to still use fsync (or O_SYNC for file open).

For example, if a file grows while O_DIRECT is enabled it will still write to the new part of the file, however since the metadata doesn't reflect the new size of the file the tail portion can be lost in the event of a crash.

Solution:

Continue to use fsync when important metadata changes or use O_SYNC in addition to O_DIRECT.

To sum it up: on certain file systems, skipping fsync() can leave file metadata unsynced, so recently written data may be lost after a crash. From v5.6.7 onward, however, MySQL (well, InnoDB) offers the O_DIRECT_NO_FSYNC option, so you can tailor this behavior to the capabilities of your own OS and file system.
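The bug report's suggestion to keep calling fsync() when important metadata changes can be sketched roughly as follows. This is only an illustration, not InnoDB's actual code: the function name `pwrite_sync_on_growth` is hypothetical, it uses a plain buffered file descriptor rather than O_DIRECT to stay portable, and it treats "the file size grew" as the only metadata change worth syncing. It also ignores the drive's own write cache, which is a separate reason to call fsync().

```python
import os


def pwrite_sync_on_growth(fd, data, offset):
    """Write data at offset and call fsync() only if the file grew.

    Hypothetical helper: returns True when fsync() was called.
    """
    size_before = os.fstat(fd).st_size

    # os.pwrite may write fewer bytes than requested, so loop until done.
    view = memoryview(data)
    while view:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written

    # A larger size means the file's metadata changed, so sync it.
    grew = os.fstat(fd).st_size > size_before
    if grew:
        os.fsync(fd)
    return grew
```

Overwriting bytes inside the existing file returns False (no fsync), while a write that extends the file returns True.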

Thank you Shadow. Besides metadata, it seems that fsync() is also needed because of the volatile write-back cache of the disk/device: "Recall that the storage may itself store the data in a write-back cache, so fsync() is still required for files opened with O_DIRECT in order to save the data to stable storage" (lwn.net/Articles/457667) - YuFeng Shen, Jan 4, 2017 at 1:50

You need O_DIRECT|O_SYNC for some FSes, but that's hideously slow, so writing async then doing fsync is often superior. - Craig Ringer, Apr 8, 2018 at 11:54

O_DIRECT bypasses the OS page cache, but it does not ensure that the data is persisted on disk: an O_DIRECT write may only reach the drive's write cache. Once the drive's write cache is disabled, the write rate falls to the level seen with fsync(). O_DIRECT on its own could be a good option if the drive's write cache is crash-safe (for example, battery-backed).

Check this blog for a very thorough analysis
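The pattern discussed in this answer, bypassing the page cache with O_DIRECT and still calling fsync() afterward, might look roughly like the sketch below. It is not how InnoDB implements it. The names `write_durably` and `BLOCK` are made up for illustration, the 4096-byte alignment is an assumption (real code should query the device), and the code falls back to a buffered write where O_DIRECT is unavailable or rejected by the file system.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; not queried from the device


def _write_and_sync(path, buf, flags):
    fd = os.open(path, flags, 0o644)
    try:
        if os.write(fd, buf) != len(buf):
            raise OSError("short write")
        # Still needed with O_DIRECT: flushes metadata and asks the
        # drive to empty its volatile write cache.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_durably(path, payload):
    """Write payload, zero-padded to a multiple of BLOCK, then fsync.

    Hypothetical helper: returns True if O_DIRECT was actually used.
    """
    nblocks = max(1, -(-len(payload) // BLOCK))  # ceiling division
    # An anonymous mmap is page-aligned and zero-filled, which meets
    # the buffer alignment that O_DIRECT typically requires.
    buf = mmap.mmap(-1, nblocks * BLOCK)
    buf.write(payload)
    base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    direct = getattr(os, "O_DIRECT", 0)  # 0 where the flag is missing
    try:
        if direct:
            try:
                _write_and_sync(path, buf, base | direct)
                return True
            except OSError:
                pass  # file system refused O_DIRECT; retry buffered
        _write_and_sync(path, buf, base)
        return False
    finally:
        buf.close()
```

Calling `write_durably("page.bin", b"hello")` leaves a 4096-byte file on disk. Whether the call returns True depends on the platform and file system.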
