Is there a way to uniquely identify a file (and possibly directories) for the lifetime of the file, regardless of moves, renames and content modifications? (Windows 2000 and later.) Making a copy of a file should give the copy its own unique identifier.

My application associates various meta-data with individual files. If files are modified, renamed or moved it would be useful to be able to automatically detect and update file associations.

FileSystemWatcher can provide events that inform of these sorts of changes, however it uses a memory buffer that can be easily filled (and events lost) if many file system events occur quickly.
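The overflow case can at least be detected, even if it cannot be prevented. The following is a minimal sketch, assuming a full rescan is an acceptable recovery; `WatcherSketch` and the `onOverflow` callback are hypothetical names, not part of any existing code in this thread.

```csharp
using System;
using System.IO;

public static class WatcherSketch
{
    // Creates a watcher with a 64 KB buffer and invokes onOverflow
    // when the internal buffer overflows, i.e. when events were lost.
    public static FileSystemWatcher Create(string directory, Action onOverflow)
    {
        var watcher = new FileSystemWatcher(directory)
        {
            IncludeSubdirectories = true,
            InternalBufferSize = 64 * 1024,   // the default is 8 KB
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.DirectoryName | NotifyFilters.LastWrite
        };

        watcher.Error += (sender, e) =>
        {
            // An InternalBufferOverflowException means events were dropped,
            // so the caller has to resynchronise by other means.
            if (e.GetException() is InternalBufferOverflowException)
                onOverflow();
        };

        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

A larger buffer only makes overflow less likely; the `Error` handler is what tells you a rescan is needed.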

A hash is no use because the content of the file can change, and so the hash will change.

I had thought of using the file creation date; however, there are a few situations where this will not be unique (e.g. when multiple files are copied at once).

I've also heard of a file SID (security ID?) in NTFS, but I'm not sure if this would do what I'm looking for.

Any ideas?

Here's sample code that returns a unique File Index.

ApproachA() is what I came up with after a bit of research. ApproachB() is thanks to information in the links provided by Mattias and Rubens. Given a specific file, both approaches return the same file index (during my basic testing).

Some caveats from MSDN:

Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.

In the FAT file system, the file ID is generated from the first cluster of the containing directory and the byte offset within the directory of the entry for the file. Some defragmentation products change this byte offset. (Windows in-box defragmentation does not.) Thus, a FAT file ID can change over time. Renaming a file in the FAT file system can also change the file ID, but only if the new file name is longer than the old one.

In the NTFS file system, a file keeps the same file ID until it is deleted. You can replace one file with another file without changing the file ID by using the ReplaceFile function. However, the file ID of the replacement file, not the replaced file, is retained as the file ID of the resulting file.

The first bolded statement above worries me. It's not clear whether it applies to FAT only, and it seems to contradict the second bolded text. I guess further testing is the only way to be sure.

[Update: in my testing the file index/id changes when a file is moved from one internal NTFS hard drive to another internal NTFS hard drive.]

    using System;
    using System.ComponentModel;
    using System.IO;
    using System.Runtime.InteropServices;
    using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

    public class WinAPI
    {
        // Returns an NTSTATUS code; negative values indicate failure.
        [DllImport("ntdll.dll")]
        public static extern int NtQueryInformationFile(IntPtr fileHandle, ref IO_STATUS_BLOCK IoStatusBlock, IntPtr pInfoBlock, uint length, FILE_INFORMATION_CLASS fileInformation);

        // Both members are pointer-sized in the native structure.
        public struct IO_STATUS_BLOCK
        {
            public IntPtr status;
            public IntPtr information;
        }

        public struct _FILE_INTERNAL_INFORMATION
        {
            public ulong IndexNumber;
        }

        // Abbreviated, there are more values than shown
        public enum FILE_INFORMATION_CLASS
        {
            FileDirectoryInformation = 1,     // 1
            FileFullDirectoryInformation,     // 2
            FileBothDirectoryInformation,     // 3
            FileBasicInformation,             // 4
            FileStandardInformation,          // 5
            FileInternalInformation           // 6
        }

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern bool GetFileInformationByHandle(IntPtr hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

        public struct BY_HANDLE_FILE_INFORMATION
        {
            public uint FileAttributes;
            public FILETIME CreationTime;
            public FILETIME LastAccessTime;
            public FILETIME LastWriteTime;
            public uint VolumeSerialNumber;
            public uint FileSizeHigh;
            public uint FileSizeLow;
            public uint NumberOfLinks;
            public uint FileIndexHigh;
            public uint FileIndexLow;
        }
    }

    public class Test
    {
        public ulong ApproachA()
        {
            WinAPI.IO_STATUS_BLOCK iostatus = new WinAPI.IO_STATUS_BLOCK();
            int structSize = Marshal.SizeOf(typeof(WinAPI._FILE_INTERNAL_INFORMATION));
            IntPtr memPtr = Marshal.AllocHGlobal(structSize);   // unmanaged buffer that receives the result
            try
            {
                FileInfo fi = new FileInfo(@"C:\Temp\testfile.txt");
                using (FileStream fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
                {
                    int status = WinAPI.NtQueryInformationFile(fs.SafeFileHandle.DangerousGetHandle(), ref iostatus, memPtr, (uint)structSize, WinAPI.FILE_INFORMATION_CLASS.FileInternalInformation);
                    if (status < 0)
                        throw new IOException("NtQueryInformationFile failed, NTSTATUS 0x" + status.ToString("X8"));
                }
                WinAPI._FILE_INTERNAL_INFORMATION objectIDInfo = (WinAPI._FILE_INTERNAL_INFORMATION)Marshal.PtrToStructure(memPtr, typeof(WinAPI._FILE_INTERNAL_INFORMATION));
                return objectIDInfo.IndexNumber;
            }
            finally
            {
                Marshal.FreeHGlobal(memPtr);
            }
        }

        public ulong ApproachB()
        {
            WinAPI.BY_HANDLE_FILE_INFORMATION objectFileInfo;
            FileInfo fi = new FileInfo(@"C:\Temp\testfile.txt");
            using (FileStream fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                if (!WinAPI.GetFileInformationByHandle(fs.SafeFileHandle.DangerousGetHandle(), out objectFileInfo))
                    throw new Win32Exception(Marshal.GetLastWin32Error());
            }
            // Combine the two 32-bit halves into the 64-bit file index.
            return ((ulong)objectFileInfo.FileIndexHigh << 32) | objectFileInfo.FileIndexLow;
        }
    }

Nice, I tried it out, but found one problem: it doesn't work for Microsoft Office files (doc, docx, xls, ...). Every time you make a change, Office tends to delete the file and create a new one to replace it, so the reference number changes, although it is still unique. It can't be used to detect changes in those files, and other programs may take a similar approach. So I guess I'll go back to my CreationTime method... – VHanded Mar 28, 2011 at 2:44

FWIW, I could not get Ashley Henderson's ApproachA to work, but ApproachB did work. I was able to confirm VHanded's note that the file ID does change when you modify a Word (docx) document, which is too bad because Office files are the ones I wanted to track. – Jeffrey Roughgarden Mar 31, 2012 at 0:46

"Every time you make a change, Office tends to delete the file"... dammit, AutoCAD does this too. Back to FileSystemWatcher. – CAD bloke Nov 11, 2015 at 21:00

If you call GetFileInformationByHandle, you'll get a file ID in BY_HANDLE_FILE_INFORMATION.nFileIndexHigh/Low. This index is unique within a volume, and stays the same even if you move the file (within the volume) or rename it.
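Because the index is only unique within a volume, a key that has to work across volumes presumably needs the volume serial number as well (the `VolumeSerialNumber` field of the same structure). Here is a minimal sketch of such a composite key; `FileIdentity` is a hypothetical type name, and the three constructor arguments are assumed to come from a successful GetFileInformationByHandle call.

```csharp
using System;

// Hypothetical value type: identifies a file by volume serial number plus file index.
public readonly struct FileIdentity : IEquatable<FileIdentity>
{
    public uint VolumeSerialNumber { get; }
    public ulong FileIndex { get; }

    public FileIdentity(uint volumeSerialNumber, uint fileIndexHigh, uint fileIndexLow)
    {
        VolumeSerialNumber = volumeSerialNumber;
        // Combine the two 32-bit halves into one 64-bit index.
        FileIndex = ((ulong)fileIndexHigh << 32) | fileIndexLow;
    }

    public bool Equals(FileIdentity other) =>
        VolumeSerialNumber == other.VolumeSerialNumber && FileIndex == other.FileIndex;

    public override bool Equals(object obj) => obj is FileIdentity other && Equals(other);

    public override int GetHashCode() => (VolumeSerialNumber, FileIndex).GetHashCode();

    public override string ToString() => $"{VolumeSerialNumber:X8}:{FileIndex:X16}";
}
```

A value like this can serve as a dictionary key for the metadata associations. It still inherits every caveat of the underlying file ID, including the change when a file is moved to another volume.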

If you can assume that NTFS is used, you may also want to consider using Alternate Data Streams to store the metadata.

Thanks for the link. I actually found another API call that returns the same ID but requires a bit more work. I might post the code I came up with. An Alternate Data Stream could be useful too, as the only place I'll be encountering FAT is on USB keys / external drives. However, I've heard some anti-virus / security software may have an issue with adding hidden data to files. – Ash Dec 8, 2009 at 12:35

The documentation for GetFileInformationByHandle says: "nFileIndexLow: Low-order part of a unique identifier that is associated with a file. This value is useful ONLY WHILE THE FILE IS OPEN by at least one process. If no processes have it open, the index may change the next time the file is opened." – Integer Poet Jul 27, 2010 at 19:58

Hm, this clause doesn't seem to be in the documentation on MSDN (anymore?). Empirically, I'm seeing the file index remain unique across reboots (on an NTFS filesystem). – sqweek Dec 12, 2017 at 9:56

The documentation also states: "Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time." (learn.microsoft.com/de-ch/windows/desktop/api/fileapi/…) – Peanut Nov 7, 2018 at 14:30

On NTFS and ReFS the file ID does not change for the lifetime of the file. That documentation was added for the FAT family (FAT, FAT32, exFAT) and possibly others, where renaming a file (for example) will change the file ID. needleinathreadstack.wordpress.com/2020/09/22/… – Alnoor Dec 10, 2021 at 14:35

One thing you can use as a file UID is the creation timestamp. When you scan the files into your program, tweak any creation time that matches one already encountered so that it is minutely different. A few files, even on NTFS, can otherwise share a timestamp if they were created at the same time by a mass file copy. In the normal course of things you won't get duplicates, so few if any tweaks will be needed.
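The tweaking step might look something like the sketch below. It assumes a one-tick (100 ns) nudge is small enough to be harmless; `CreationTimeIds` and `MakeUnique` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CreationTimeIds
{
    // Returns a timestamp not yet present in 'seen', nudging it forward by one
    // tick (100 ns) at a time until it is unique, and records the result.
    public static DateTime MakeUnique(DateTime creationTimeUtc, HashSet<DateTime> seen)
    {
        DateTime candidate = creationTimeUtc;
        while (!seen.Add(candidate))
            candidate = candidate.AddTicks(1);
        return candidate;
    }
}
```

If the returned value differs from the file's current creation time, it would then be written back, for example with `File.SetCreationTimeUtc`. The timestamp resolution a file system actually stores varies, so the written value should be read back and checked.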

The asker also mentions unique directory identification. That process is a bit more convoluted than retrieving unique information for a file; however, it is possible. It requires you to call the CreateFile function with a particular flag (FILE_FLAG_BACKUP_SEMANTICS, which is what allows a directory to be opened). With that handle, you can call the GetFileInformationByHandle function in Ash's answer.

This also requires a kernel32.dll import:

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern SafeFileHandle CreateFile(
            string lpFileName,
            [MarshalAs(UnmanagedType.U4)] FileAccess dwDesiredAccess,
            [MarshalAs(UnmanagedType.U4)] FileShare dwShareMode,
            IntPtr securityAttributes,
            [MarshalAs(UnmanagedType.U4)] FileMode dwCreationDisposition,
            uint dwFlagsAndAttributes,
            IntPtr hTemplateFile);

I'll flesh out this answer a bit more later. With the answer linked above, though, this should begin to make sense. A new favorite resource of mine is pinvoke, which has helped me with .NET C# signature possibilities.
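Until then, here is a minimal sketch of how the pieces might fit together for a directory. It declares its own `CreateFile` signature with plain `uint` parameters rather than the enum-based one above, which is an assumption made to keep the example self-contained. `DirectoryId`, `Combine` and `GetIndex` are hypothetical names, and the code only works on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

public static class DirectoryId
{
    const uint FILE_READ_ATTRIBUTES = 0x0080;
    const uint FILE_SHARE_ALL = 0x7;                     // read | write | delete
    const uint OPEN_EXISTING = 3;
    const uint FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;  // required to open a directory

    [StructLayout(LayoutKind.Sequential)]
    struct BY_HANDLE_FILE_INFORMATION
    {
        public uint FileAttributes;
        public FILETIME CreationTime, LastAccessTime, LastWriteTime;
        public uint VolumeSerialNumber, FileSizeHigh, FileSizeLow, NumberOfLinks, FileIndexHigh, FileIndexLow;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    static extern SafeFileHandle CreateFile(string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition, uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool GetFileInformationByHandle(SafeFileHandle hFile, out BY_HANDLE_FILE_INFORMATION info);

    // Joins the two 32-bit halves of the file index into one 64-bit value.
    public static ulong Combine(uint high, uint low) => ((ulong)high << 32) | low;

    // Opens the directory for attribute access only and returns its file index.
    public static ulong GetIndex(string directoryPath)
    {
        using (SafeFileHandle handle = CreateFile(directoryPath, FILE_READ_ATTRIBUTES, FILE_SHARE_ALL,
            IntPtr.Zero, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, IntPtr.Zero))
        {
            if (handle.IsInvalid)
                throw new Win32Exception(Marshal.GetLastWin32Error());
            if (!GetFileInformationByHandle(handle, out BY_HANDLE_FILE_INFORMATION info))
                throw new Win32Exception(Marshal.GetLastWin32Error());
            return Combine(info.FileIndexHigh, info.FileIndexLow);
        }
    }
}
```

Calling `DirectoryId.GetIndex(@"C:\Temp")` before and after renaming the directory is a simple way to check whether the index is stable on a given file system.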
