MFT Analysis

for Incident Responders

By Willi Ballenthin / @willballenthin

What's going on in here?

  • New Technology File System (NTFS) used on most Windows machines
  • Master File Table (MFT) contains most metadata for entire file system
  • found in [filesystem root]/$MFT, once per filesystem
  • rather than rely on layers of technology, work directly with the MFT

Direct MFT analysis for great good

  • MFT is typically small, often under 100MB (although 12.5% of the volume is reserved for it)
  • compresses well (usually ~90% ratio)
  • answer most questions you have
  • good for data recovery
  • flexible set of analysis tools

Don't rely on a tool until you know what it's doing.

Structures

  • MFT is a variably sized file
  • contents are fixed-size records: 1024 bytes
  • (at least) one record for each file/directory
  • MFT reference: index into the MFT array
  • references among records form FS tree
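
Because records are fixed size, the MFT can be treated as a plain array. A minimal sketch, assuming the whole $MFT has been read into memory; the function names are made up for illustration, and the split of a reference into a 48-bit record index plus a 16-bit sequence number is an NTFS detail not shown on the slide:

```python
RECORD_SIZE = 1024  # fixed record size stated above


def split_reference(ref):
    """Split a 64-bit MFT reference into (record index, sequence number)."""
    return ref & 0xFFFFFFFFFFFF, ref >> 48


def record_offset(ref):
    """Byte offset of the referenced record inside the $MFT file."""
    index, _sequence = split_reference(ref)
    return index * RECORD_SIZE


def iter_records(mft_bytes):
    """Yield (index, raw_record) for every complete record in the buffer."""
    last_start = len(mft_bytes) - RECORD_SIZE
    for offset in range(0, last_start + 1, RECORD_SIZE):
        yield offset // RECORD_SIZE, mft_bytes[offset:offset + RECORD_SIZE]
```

For a large $MFT, the same arithmetic works with `file.seek(record_offset(ref))` instead of slicing a buffer.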

MFT Record

  • static size: 1024 bytes
  • small header
  • sequence of attributes, stored contiguously

 typedef struct {
 /*Ofs*/
 /*  0*/ NTFS_RECORD_TYPE magic;
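 /*  4*/ le16 usa_ofs;          /* offset to the update sequence array */
 /*  6*/ le16 usa_count;        /* number of entries in the update sequence array */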
 /*  8*/ le64 lsn;             
 /* 16*/ le16 sequence_number; 
 /* 18*/ le16 link_count;      
 /* 20*/ le16 attrs_offset;    
 /* 22*/ MFT_RECORD_FLAGS flags;
 /* 24*/ le32 bytes_in_use;     
 /* 28*/ le32 bytes_allocated;  
 /* 32*/ leMFT_REF base_mft_record;
 /* 40*/ le16 next_attr_instance;
 /* 42*/ le16 reserved;          
 /* 44*/ le32 mft_record_number; 
 /* sizeof() = 48 bytes */
 } __attribute__ ((__packed__)) MFT_RECORD;
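
The struct above maps directly onto a Python `struct` format string. This is a sketch, not a full parser: it skips update sequence fixups, and the flag values (0x0001 in use, 0x0002 directory) are NTFS facts that are not on the slide.

```python
import struct
from collections import namedtuple

# Little-endian, packed: mirrors the 48-byte MFT_RECORD header field by field.
MFT_HEADER = struct.Struct("<4sHHQHHHHIIQHHI")

MftHeader = namedtuple("MftHeader", [
    "magic", "usa_ofs", "usa_count", "lsn", "sequence_number",
    "link_count", "attrs_offset", "flags", "bytes_in_use",
    "bytes_allocated", "base_mft_record", "next_attr_instance",
    "reserved", "mft_record_number",
])

FLAG_IN_USE = 0x0001      # record is allocated
FLAG_DIRECTORY = 0x0002   # record describes a directory


def parse_header(record):
    """Parse the fixed header at the start of a raw MFT record."""
    return MftHeader._make(MFT_HEADER.unpack_from(record, 0))


def is_in_use(header):
    return bool(header.flags & FLAG_IN_USE)


def is_directory(header):
    return bool(header.flags & FLAG_DIRECTORY)
```

A record whose `magic` is not `b"FILE"` (for example `b"BAAD"` or all zeros) should be treated with suspicion before any other field is trusted.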

Attributes

  • 17 standard attributes, extensible by user
    • 16d/0x10 - Standard Information ($SI)
    • 48d/0x30 - Filename Information ($FN)
    • 128d/0x80 - Data
    • 144d/0x90 - Directory Index Root
    • 160d/0xA0 - Directory Index Allocation
  • common structure:
    • DWORD attribute type
    • DWORD attribute size
    • offset 0x10 (after the 16-byte common header), attribute specific data
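
Since every attribute starts with its type and size, walking a record is a simple loop. A minimal sketch, assuming `attrs_offset` comes from the record header; the 0xFFFFFFFF end marker is an NTFS detail not shown on the slide, and `iter_attributes` is a made-up name:

```python
import struct

END_MARKER = 0xFFFFFFFF  # type value that terminates the attribute list

ATTR_NAMES = {
    0x10: "Standard Information",
    0x30: "Filename Information",
    0x80: "Data",
    0x90: "Directory Index Root",
    0xA0: "Directory Index Allocation",
}


def iter_attributes(record, attrs_offset):
    """Yield (attribute type, raw attribute bytes) for one MFT record."""
    offset = attrs_offset
    while offset + 4 <= len(record):
        (attr_type,) = struct.unpack_from("<I", record, offset)
        if attr_type == END_MARKER or offset + 8 > len(record):
            return
        (attr_size,) = struct.unpack_from("<I", record, offset + 4)
        # A size that is too small or runs off the record means corruption.
        if attr_size < 8 or offset + attr_size > len(record):
            return
        yield attr_type, record[offset:offset + attr_size]
        offset += attr_size
```

The size checks matter in practice: deleted or partially overwritten records often carry nonsense sizes, and a naive loop would spin forever or read garbage.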

Residency

  • not all attributes fit in the fixed record size
  • resident attribute: content stored inline
  • non-resident attribute: attribute points to external data runs
  • examples:
    • small file: stored within MFT
    • large file: fragmented across disk
  • note: things get really complex for huge files
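
To make the resident/non-resident split concrete, here is a sketch that reads inline content and decodes a data run list. The field offsets (non-resident flag at byte 8, content length and offset at 0x10 and 0x14) and the run encoding come from the NTFS on-disk format, not from the slide; compressed and encrypted attributes are ignored.

```python
import struct


def is_resident(attr):
    """Byte 8 of the attribute header is 0 for resident attributes."""
    return attr[8] == 0


def resident_content(attr):
    """Return the inline content of a resident attribute."""
    length, offset = struct.unpack_from("<IH", attr, 0x10)
    return attr[offset:offset + length]


def decode_data_runs(runlist):
    """Decode a run list into (starting cluster, cluster count) pairs.

    Each run begins with a header byte: the low nibble is the size of the
    length field, the high nibble the size of the offset field. Offsets are
    signed and relative to the previous run. A sparse run has no offset and
    is reported with a starting cluster of None.
    """
    runs = []
    pos = 0
    cluster = 0
    while pos < len(runlist) and runlist[pos] != 0:
        header = runlist[pos]
        length_size = header & 0x0F
        offset_size = header >> 4
        pos += 1
        length = int.from_bytes(runlist[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, length))
        else:
            delta = int.from_bytes(runlist[pos:pos + offset_size],
                                   "little", signed=True)
            cluster += delta
            runs.append((cluster, length))
        pos += offset_size
    return runs
```

For example, the run list `21 18 34 56 00` describes a single run of 0x18 clusters starting at cluster 0x5634.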

Standard Information

  • always resident, one per entry
  • contains good stuff:
    • MACB timestamps
    • hidden? system? other flags
    • quota info
    • not size
    • not filename

Filename Information

  • may be multiple per entry, at least one resident
  • types: 8.3, Unicode, POSIX
  • contains good stuff
    • filename
    • MACB timestamps of filename
    • parent directory MFT reference
    • size sorta

Timestamps

  • At least eight (8) timestamps per entry
  • Standard Information
    • For: file content
    • Shown by Explorer, most forensic tools
    • Easily modified via SetFileTime API
  • Filename Information
    • For: filenames
    • Difficult to stomp, need unusual copy operation (SetMACE) or kernel driver
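
Comparing the two timestamp sets is a cheap timestomping check. The sketch below assumes you already have the raw content of the $SI and $FN attributes. The layout used (four 64-bit FILETIME values at offset 0 of $SI and at offset 8 of $FN, in the order created, modified, MFT changed, accessed) is from the NTFS on-disk format. The two heuristics are commonly used indicators, not proof, and are not from the slide.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # FILETIME counts 100 ns intervals


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME value to an aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def si_timestamps(si_content):
    """(created, modified, mft_changed, accessed) from $SI content."""
    return struct.unpack_from("<4Q", si_content, 0)


def fn_timestamps(fn_content):
    """Same four values from $FN content (they follow the parent reference)."""
    return struct.unpack_from("<4Q", fn_content, 8)


def looks_timestomped(si_created, fn_created):
    """Heuristic only: flag entries worth a manual look.

    - $SI creation earlier than $FN creation
    - $SI creation with a zero sub-second part
    """
    return si_created < fn_created or si_created % TICKS_PER_SECOND == 0
```

Expect false positives: files extracted from archives or copied by installers can trip both checks legitimately.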

Tools

Tool: MFTView

  • boring name, sorry
  • interactive inspection of MFT with tree view
  • some features:
    • strings, hex view
    • integrated INDX root parsing
    • data extraction & cluster run calculation
  • source

Tool: list-mft

  • compare with: AnalyzeMFT
  • offers better memory usage, speed
  • supports:
    • standard information
    • multiple filename information
    • INDX root
  • renamed from MFTINDX.py
  • source

Tool: get_file_info

  • manual MFT record inspection
  • timeline all embedded timestamps
  • extract strings from allocated/slack space
  • source

Tool: fuse-mft

  • FUSE driver for MFTs
  • mount an MFT and explore using favorite CLI/GUI tools
  • read from resident files, all metadata mirrors MFT entries
  • check /path/to/file::meta for goodies
  • source

Advanced Topics

Record Slack Space

  • NTFS doesn't zero out reused records, it just overwrites the old data
  • new record content is often smaller, so older artifacts are often recoverable
  • inspect final bytes for:
    • UTF-16LE strings
    • timestamps
  • DO THIS! automated tools don't do this for you
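
A minimal sketch of the slack check, assuming a raw 1024-byte record and the `bytes_in_use` field at header offset 24 (see the MFT_RECORD struct). It only looks for printable ASCII encoded as UTF-16LE; function names are made up for illustration.

```python
import re
import struct


def record_slack(record):
    """Return the bytes after bytes_in_use, i.e. the record's slack space."""
    (bytes_in_use,) = struct.unpack_from("<I", record, 24)
    return record[min(bytes_in_use, len(record)):]


def utf16le_strings(data, min_chars=4):
    """Find runs of printable ASCII characters stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]
```

Note that the last two bytes of each 512-byte sector hold update sequence values until fixups are applied, so a string that crosses a sector boundary may come out split or with one odd character.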

Directory Indices

  • commonly called INDX buffers
  • B+ tree for fast lookup of filenames (usually)
    • page size is 4096 bytes (== cluster size)
    • key is $FN attribute
    • value is MFT reference
  • attribute INDX_ROOT always resident, usually ~4 entries
  • attribute INDX_ALLOCATION always non-resident
  • use INDXParse.py to recover file metadata

Alternate Data Streams

  • programs may store "hidden" data in ADSs
  • unnamed $DATA attribute is a file's content
  • NTFS supports extra named $DATA attributes
  • file_basename:ads_name
  • examples:
    • malware stores configuration in ADS
    • browsers mark downloaded files with :Zone.Identifier
  • use SysInternals streams.exe
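
ADSs can also be spotted straight from the MFT: any $DATA attribute with a name is an alternate stream. A sketch, assuming a list of (type, raw attribute bytes) pairs from an attribute walker; the name length at byte 9 (in UTF-16 characters) and the name offset at byte 10 are from the NTFS attribute header, not from the slide.

```python
import struct

DATA_TYPE = 0x80  # $DATA


def attribute_name(attr):
    """Return the attribute's name, or '' if it is unnamed."""
    name_length = attr[9]  # counted in UTF-16 code units
    (name_offset,) = struct.unpack_from("<H", attr, 10)
    raw = attr[name_offset:name_offset + 2 * name_length]
    return raw.decode("utf-16-le", errors="replace")


def alternate_streams(attributes):
    """Names of all named $DATA attributes in one record."""
    names = []
    for attr_type, raw in attributes:
        if attr_type == DATA_TYPE:
            name = attribute_name(raw)
            if name:
                names.append(name)
    return names
```

A result such as `["Zone.Identifier"]` on an executable is a useful hint that the file was downloaded.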

File System Tunneling

  • file created within 15 secs of deleted file inherits metadata
  • supports programs that copy/replace file during a save operation
  • includes, but not limited to, timestamps
  • research topic: extract/carve "quarks" from metadata cache

Rebuilding FS Tree

  • top down parsing of FS is fast, but requires non-resident directory indices
  • bottom up reconstruction is self-contained in MFT
  • walk all $FN parent references
  • shortcut: cache common subtree roots
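
The bottom-up walk with a cache can be sketched as below. It assumes a dict built from the $FN attributes that maps record index to (filename, parent record index). Record 5 as the root directory is an NTFS fact not on the slide, and the `\$ORPHAN` prefix is a made-up convention for entries whose parent chain is broken. Sequence numbers are not checked here.

```python
ROOT_INDEX = 5  # MFT record number of the root directory


def build_path(index, entries, cache):
    """Rebuild the full path of a record by following $FN parent references.

    entries: dict of record index -> (filename, parent record index)
    cache:   dict of record index -> already rebuilt path, shared across calls
    The root directory itself is returned as an empty string.
    """
    chain = []
    seen = set()
    current = index
    while True:
        if current == ROOT_INDEX:
            prefix = ""
            break
        if current in cache:
            prefix = cache[current]  # shortcut: subtree already resolved
            break
        if current not in entries or current in seen:
            prefix = "\\$ORPHAN"     # missing parent or reference loop
            break
        seen.add(current)
        chain.append(current)
        current = entries[current][1]
    # Unwind from the topmost unresolved ancestor down to the target,
    # caching every intermediate directory on the way.
    for idx in reversed(chain):
        prefix = prefix + "\\" + entries[idx][0]
        cache[idx] = prefix
    return prefix
```

Sharing one `cache` dict across all records means each directory is resolved once, which is what keeps a full-MFT listing fast.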

KBOTD(s)

Directory Indices are not Unique

  • Also indices for $SII, $Quota, $ObjId
  • NTFS supports multiple index views over same data(!)

B+ indices support variable sized keys

Ownership is complex to compute

$SI → $SII → SID → Registry/AD → Username

THE END

BY Willi Ballenthin