Investigating Office365 Account Compromise without the Activities API

With the recent demise of the Office 365 Activities API, David Cowen at HECFBlog has chosen to focus his recent Sunday Funday Challenge on the remaining evidence sources available when investigating instances of Office365 account compromise. David posed the following question:
“Explain in a compromise of a Office365 account what you could review in the following circumstances.
  • Scenario a: only default logging in a E3 plan
  • Scenario b: Full mailbox auditing turned on
You are attempting in both scenarios to understand the scope of the attacker's access.”
The first point to note is that a compromise of Office 365 (while commonly referred to as Business Email Compromise (BEC)) is not necessarily limited to email accounts. Depending on how an organisation employs Office 365, it may host a wealth of information besides just email and attachments in O365, much of which could be valuable to an attacker. In the case of the in-scope E3 plan, each compromised user account could potentially expose:
  • Exchange — Email messages, attachments and Calendars (Mailbox size up to 100GB)
  • OneDrive — 1TB per user, unless increased by admins to up to 25TB.
  • SharePoint — Whatever sites that user has access to.
  • Skype — Messages, call and video call history data
  • Microsoft Teams — Messages, call and video call history data as well as data within integrated apps.
  • Yammer — Whatever it is people actually do on Yammer. Are you prepared for a full compromise of your organisation's memes, reaction gifs and cat pictures?

All of that before you concern yourself with the likelihood of credential reuse, passwords which may be stored within O365 (within documents and emails) for other services, delegated access to other mailboxes and MDM functionality.

A Short(er) Answer

David has chosen to focus on an E3 Office 365 instance, with and without additional logging functionality enabled. Some evidence sources available in these two circumstances will be as follows.

Scenario a: only default logging in a E3 plan
Below is a non-comprehensive list of evidence sources which may be available to an examiner to assist in understanding the scale/scope of an O365 compromise:
  • Unified Audit Log, via Audit Log Search in the Security & Compliance Center and accessible using the 'Search-UnifiedAuditLog' cmdlet. This will need to be enabled if it is not already, and appears to provide limited retrospective visibility if enabled after the fact.
  • Mailbox Content
  • Read Tracking 
  • Message Tracking Logs
  • Mailbox Rule information
  • Proxy Logs/ DNS Logs/ Endpoint AV Logs / SIEM
  • Office 365 Management Activity API
  • Azure Active Directory reports and Reporting Audit API (With Azure AD P1/P2)

Scenario b: Full mailbox auditing turned on
By default, Auditing is not enabled, nor are the more granular Mailbox Auditing and SharePoint Site Collection Audit options. However, if we assume that 'audit log search' has been enabled as well as the optional logging associated with enabling 'mailbox auditing' and that audit has been configured for all SharePoint site collections then the following additional evidence sources become available.
  • Unified Audit Log, includes events recorded as a result of enabling 'mailbox auditing'.
  • SharePoint Audit log reports

It should be noted that simply enabling mailbox audit logging for all mailboxes is not enough to capture all useful events. By default, only the 'UpdateFolderPermissions' action is logged, with additional events requiring configuration; these include Create, HardDelete, MailboxLogin, Move, MoveToDeletedItems, SoftDelete and Update events.

SharePoint audit logging is pretty granular and, in my experience, rarely enabled. However, if correctly configured, it can generate a record of user actions including document access, modification and deletion.

These evidence sources, their usefulness and some suggested methodologies to leverage them are outlined in the following sections. In a number of cases I have listed links for suggested additional reading as many of these topics have been well documented by Microsoft or others before me.

Unified Audit Log

The Unified Audit Log (UAL) is currently the single best source of evidence (when available) for Office 365 account compromise investigations. If enabled, user and admin activity can be searched via the Security & Compliance Center or using the 'Search-UnifiedAuditLog' cmdlet. Logged activity from your tenant is recorded in the audit log and retained for 90 days. It should be noted that some latency occurs between events occurring and appearing in the logs; in some cases (and for some event types) Microsoft detail that this can be up to 24 hours. I have had mixed results in testing whether events prior to enabling auditing become searchable if auditing is enabled after the fact, and I plan to perform additional testing and update this post with the results.

By default, Audit Log Search is not enabled, and attempts to access or use the Audit Log Search functionality within the Security & Compliance Center will be met with various errors and warnings:

Likewise, attempts to use the 'Search-UnifiedAuditLog' cmdlet will fail.

UAL search functionality can be enabled with the following PowerShell command: 

Set-AdminAuditLogConfig -UnifiedAuditLogIngestionEnabled $true

Per Microsoft’s 'Search the audit log in the Office 365 Security & Compliance Center' support article, by default the UAL will contain records of:
  • "User activity in SharePoint Online and OneDrive for Business
  • User activity in Exchange Online (Exchange mailbox audit logging)
  • Admin activity in SharePoint Online
  • Admin activity in Azure Active Directory (the directory service for Office 365)
  • Admin activity in Exchange Online (Exchange admin audit logging)
  • User and admin activity in Sway
  • eDiscovery activities in the Office 365 Security & Compliance Center
  • User and admin activity in Power BI for Office 365
  • User and admin activity in Microsoft Teams
  • User and admin activity in Yammer
  • User and admin activity in Microsoft Stream"

The same article includes the following important note:
“Important: Mailbox audit logging must be turned on for each user mailbox before user activity in Exchange Online will be logged. For more information, see Enable mailbox auditing in Office 365.”
Additionally, if the Audit Log Search functionality is enabled after the fact it will take some hours to become available and thereafter 24 hours for some events to be populated.  

If enabled, the Audit Log can be searched and exported in one of two ways. Firstly, it is accessible via the Security & Compliance Center by navigating to Search & Investigation -> Audit log search. This will present you with a screen as below which can be used to perform searches and export results:

Alternatively, the ‘Search-UnifiedAuditLog’ cmdlet can be used to perform searches and output targeted results. Some useful commands are provided below:

Dump ALL available Audit Data within a date range:

Search-UnifiedAuditLog -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] | Export-csv "E:\Cases\InvestigationXYZ\FullAuditDump.csv"

Unsurprisingly, this fails on even medium-sized tenants or where the date range is too large, because the maximum number of records which can be retrieved during a particular session is capped at 50,000. The Microsoft blog post 'Retrieving Office 365 Audit Data using PowerShell' addresses this issue and provides a script which can assist. In any event, more targeted searches are advisable:

Dump ALL available Audit Data within a date range for a particular user:

Search-UnifiedAuditLog -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] -UserIds [USER,USER,USER] | Export-csv "E:\Cases\InvestigationXYZ\UserActivity.csv"

Review failed login attempts for all users:

Search-UnifiedAuditLog -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] -Operations UserLoginFailed -SessionCommand ReturnLargeSet -ResultSize 5000 | Export-csv "E:\Cases\InvestigationXYZ\FailedLogins.csv"

Find all log entries associated with a known malicious IP(s) during a specific date range:

Search-UnifiedAuditLog -IPAddresses [IPAddress] -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] -ResultSize 5000 | Export-csv "E:\Cases\InvestigationXYZ\BadIPActivity.csv"

And for a list of IPs:

Search-UnifiedAuditLog -IPAddresses [IPaddress1],[IPaddress2] -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] -ResultSize 5000 | Export-csv "E:\Cases\InvestigationXYZ\BadIPsActivity.csv"

Particular record types can be targeted using the '-RecordType' parameter and a list of values is provided in the MS documentation, here.

As previously mentioned, having Mailbox Auditing enabled will cause additional events to be logged in the UAL. To enable Mailbox Auditing for all mailboxes the following command can be run:

Get-Mailbox -ResultSize Unlimited -Filter {RecipientTypeDetails -eq "UserMailbox"} | Set-Mailbox -AuditEnabled $true

Separately from enabling UAL Search, additional logging detail can be captured by enabling ‘Mailbox Auditing’. The resulting events will be accessible using the UAL Search but it should be noted that if auditing is not enabled prior to an incident then visibility cannot be added after the fact. I have only performed limited testing of this over a short period so would be interested to hear if anyone has contrary experience.

By default, enabling mailbox auditing will only record 'UpdateFolderPermissions' events so additional configuration is required to ensure that other owner actions are captured. The below command will enable all available owner actions for all mailboxes:

Get-Mailbox -ResultSize Unlimited -Filter {RecipientTypeDetails -eq "UserMailbox"} | Set-Mailbox -AuditOwner @{Add="MailboxLogin","HardDelete","SoftDelete","FolderBind","Update","Move","MoveToDeletedItems","SendAs","SendOnBehalf","Create"}

Be aware that enabling all of these auditing settings for an entire tenant will flood the UAL with audit events and can cause a lot of noise (and potentially a performance impact for searches). More details on enabling Mailbox Auditing are available here. The data is also apparently stored in such a way that it contributes to the user's mailbox storage allocation, and this can cause issues if the recorded activity becomes too large.

To review the audit status of a particular account the following command can be used: 

Get-Mailbox -Identity [target mailbox] | fl name,*audit*

Once enabled the same UAL queries above will return more detailed results of user activity within mailboxes and most notably MailboxLogin events. In addition to the ‘Search-UnifiedAuditLog’ cmdlet there is also a 'Search-MailboxAuditLog' cmdlet which can be employed, documentation for which can be found here.

Some further useful resources on UAL and Mailbox Auditing are as follows:
While traditional logs (enabled or otherwise) are always a go-to source of evidence there are a number of other valuable sources which can help to identify compromised accounts and understand the scope and impact of an account compromise.

Mailbox content

The content of mailboxes is important in BEC cases for several reasons; being able to search, access and review it can help answer the following questions:
  • Who currently has a known malicious email message in their inbox?
  • Who may have read a malicious email message?
  • What was the provenance, payload and content of a malicious email message?
  • What items have been sent from a compromised account?
  • What is the possible exposure?

The content of a mailbox, or all mailboxes, as they exist at the time of analysis can be determined through the use of a number of PowerShell cmdlets. It should be noted that mailbox content searches/reviews can be targeted or tenant wide and the usefulness of these investigative methods will often be contingent on the number of users in a tenant.

Please also note that the use of these cmdlets to identify and delete malicious messages is not without risk. Typos, unnecessarily broad queries and other unforeseen complications can become CV-generating moments when you delete data you shouldn't have.

Hunting for Known Malicious Messages
It is quite common in the early phases of incident investigation for IT/IR staff to be provided with a copy (often forwarded) or a description of a phishing message. It can be desirable to capture a forensically sound copy of such a message for analysis.
Per its Microsoft documentation, "you can use the Search-Mailbox cmdlet to search messages in a specified mailbox and perform any of the following tasks:
  • Copy messages to a specified target mailbox.
  • Delete messages from the source mailbox. You have to be assigned the Mailbox Import Export management role to delete messages.
  • Perform single item recovery to recover items from a user's Recoverable Items folder.
  • Clean up the Recoverable Items folder for a mailbox when it has reached the Recoverable Items hard quota."

"Note: By default, Search-Mailbox is available only in the Mailbox Search or Mailbox Import Export roles, and these roles aren't assigned to any role groups. To use this cmdlet, you need to add one or both of the roles to a role group (for example, the Organization Management role group). Only the Mailbox Import Export role gives you access to the DeleteContent parameter."

Additionally, as Office 365 E3 Subscriptions (or Exchange Plan 2) come with eDiscovery functionality, we can also leverage the Discovery Search Mailbox and eDiscovery functionality to assist in collating the messages we wish to analyse. The below examples show queries which can be used to identify and copy samples of malicious messages:

Get-Mailbox | Search-Mailbox -SearchQuery "Subject:phish" -TargetMailbox "Discovery Search Mailbox" -TargetFolder "IncidentXYZ" -LogLevel Full

This command searches all mailboxes for email messages containing the string "phish" in their subject and copies them to the Discovery Search Mailbox within a folder called 'IncidentXYZ'; if the folder does not exist it will be created. Setting the '-LogLevel' parameter to Full will cause a CSV of results to be generated and emailed to the target mailbox; this can be extremely useful, so is recommended.

Alternatively, we can export to any other mailbox as below:

Get-Mailbox | Search-Mailbox -SearchQuery "Subject:phish" -TargetMailbox "anyone@yourdomain.com" -TargetFolder "IncidentXYZ" -LogLevel Full

Note, however, that the TargetMailbox will be excluded from any search, so make sure it doesn't contain any respondent data or it will be missed.

In both cases it can be preferable to get a feel for the number of matching responses prior to executing a copy command; this can be achieved with the '-EstimateResultOnly' parameter, which will perform the search but not copy any messages, example below:

Get-Mailbox | Search-Mailbox -SearchQuery "Subject:phish" -EstimateResultOnly

In these examples we have relied upon a known string within the subject; however, there are a number of search criteria which can be used as alternatives or in combination, e.g.:

sent:"last week" 

This, like any other date query, can accept a date (YYYY-MM-DD), date range (YYYY-MM-DD..YYYY-MM-DD) or date interval (e.g. today, yesterday, this week, this month). A fuller list of queryable attributes and descriptions is available here.

Also associated with eDiscovery functionality and again requiring the Mailbox Search role are the New-ComplianceSearch, Get-ComplianceSearch and Start-ComplianceSearch cmdlets.

Performing a Compliance Search will allow you to group messages for export, preservation or deletion. As with all of these queries, it is important to be specific enough to capture only the messages of interest; source email addresses and unique strings from subjects (particularly when combined with tight date ranges) make for good queries.

An example command below will capture all messages from 'hax0r@baddomain.cf' containing the string 'phish' in the subject and name the search 'IncidentXYZ-PhishingMessages'.

New-ComplianceSearch -Name "IncidentXYZ-PhishingMessages" -ContentMatchQuery '(From:hax0r@baddomain.cf) AND (Subject:"*phish*")'

While we are here, we are also able to remove identified malicious messages using New-ComplianceSearchAction, assuming we have already used New-ComplianceSearch to identify the malicious messages, as in the above example:

New-ComplianceSearchAction -SearchName "IncidentXYZ-PhishingMessages" -Purge -PurgeType SoftDelete

This methodology is covered in greater detail here.

Extracting a sample of malicious message(s)
An alternative method for searching for known malicious messages is to use the Security & Compliance Center as detailed here. The Security & Compliance Center is probably the easiest way to extract sample messages as you can perform searches, review results, then download individual email messages as .eml or groups of messages to a .pst, all from the comfort of a GUI.

Analysis of such samples helps in understanding the provenance, payload and content of a malicious email message. The same methodology can be used to export email messages sent from compromised accounts by malicious actors, or to perform wholesale exports of mailbox(es) for legal review when trying to assess the impact associated with a mailbox compromise.

Read Tracking

If enabled (and it is unfortunately disabled by default), Read Tracking via the 'Get-MessageTrackingReport' cmdlet can be used to determine whether email messages within the organisation have been read. Commonly, attackers will compromise one account and then use it to phish other accounts for credentials, so being able to quickly determine how many users received and read these messages can be helpful.

You can check whether Read Tracking is enabled with the following PowerShell command: 

Get-OrganizationConfig | Select ReadTrackingEnabled

If it is enabled then you are in luck and can follow the below guides, or associated scripts, to confirm who has read a particular email message:

Read tracking can be enabled with the below command: 

Set-OrganizationConfig -ReadTrackingEnabled $true

However, it should be noted that this will not have a retroactive effect. It needs to have been enabled before the notable messages were sent.

Using the methods detailed above, we can identify, extract and, if required, delete malicious messages from the mailboxes of users. However, content searches are rarely the most efficient method to answer some of these questions. Commonly we just require details of all recipients of a particular malicious email, or we want details of all accounts who have interacted with known malicious email addresses, or maybe we want a list of all addresses which were contacted from a compromised account during the known window of compromise. In these instances, Message Tracking Logs can be very helpful.

Message Tracking Logs

Message tracking logs provide a record of messages which have been transmitted into, out of and within a tenant and as such can be invaluable in instances of BEC.
An example command is detailed below:

Get-MessageTrace -StartDate [START_DATE] -EndDate [END_DATE] -PageSize 5000 | Where {$_.Subject -like "*phish*"} | ft -Wrap

This command searches Message Trace Logs for messages sent between two dates, with the string "phish" in the subject. While it increases the page size to 5000 (the maximum) from a default of 1000, this still may not be adequate in a large tenant or over a wide date range. In these cases a script may be the best solution, and one such script can be found here.

Note that the StartDate can't be more than 30 days before the date the script is run. Also, the results can be truncated when long lists of recipients are associated with a single message, and any mailing lists will have to be manually enumerated to confirm which users would be expected to receive an email message addressed to a group or shared mailbox.

It can also be useful to target messages originating from known bad email addresses or domains, e.g.: 

Get-MessageTrace -SenderAddress *@baddomain.cf | ft -Wrap

There are a number of other uses for the Message logs, and different ways queries can be used. The Microsoft Documentation associated with the cmdlet provides a number of examples and details the different query constraints which can be used.

Additionally, message trace information can also be sought within the Exchange Admin Console by navigating to EAC -> mail flow -> message trace.

Mailbox Rule information

A common technique in cases of Business Email Compromise is the use of rules by attackers to cover their tracks. The presence of maliciously added rules can often act as a quick and effective indicator to identify accounts which have been compromised.

Attackers will commonly employ sets of rules which I have come to refer to as "folder and forward" rules. They will set mailboxes to forward emails (either wholesale or subject dependent) to an external email address so they don't need to monitor the account constantly. They also often use rules to hide their malicious messages by employing foldering rules. These rules will cause any message with a specific subject (i.e. the subject they use in their fraudulent or phishing emails) to be marked as read and sent directly to a folder such as Junk, RSS or any other folder the user isn't likely to review as soon as the message is received.

In some organisations rules of this type are rare and therefore stand out immediately but YMMV, particularly in organisations where the use of these types of rules is widespread.

The following PowerShell command can help to identify this activity:

foreach ($user in (Get-Mailbox -ResultSize Unlimited).UserPrincipalName) {Get-InboxRule -Mailbox $user | Select-Object MailboxOwnerID,Name,Description,Enabled,RedirectTo,MoveToFolder,ForwardTo | Export-CSV E:\Cases\InvestigationXYZ\AllUserRules.csv -NoTypeInformation -Append}

This command will iterate through the list of all users, returning details of the rules they have configured, and will produce a CSV of the results for analysis.

In addition to the use of rules an attacker can employ the forwardingSMTPAddress feature to forward all email messages received at a compromised account to another address. This can be identified with the below command.

Get-Mailbox -ResultSize unlimited | where { $_.ForwardingSmtpAddress -ne $NULL }

This command will produce a listing of all accounts where the ForwardingSmtpAddress is not blank. It isn't uncommon for the above command to return no results, as the use of ForwardingSmtpAddress is not widespread in my experience. But blank results can make some people uneasy, so an alternative approach is to use the below command:

Get-Mailbox -resultSize unlimited | select UserPrincipalName,ForwardingSmtpAddress,DeliverToMailboxAndForward

This command will produce a listing of all accounts and include details of the forwarding address (if set) and DeliverToMailboxAndForward status.
We can pipe these commands to CSV for later review (recommended) as follows:

Get-Mailbox -resultSize unlimited | select UserPrincipalName,ForwardingSmtpAddress,DeliverToMailboxAndForward | Export-csv E:\Cases\InvestigationXYZ\FullForwarding.csv -NoTypeInformation

While all of the above commands are currently written to search a full tenant, if required they can be modified to target specific mailboxes or groups/lists of users.

Proxy Logs/ DNS Logs/ Endpoint AV Logs

I raise these evidence sources as a reminder not to be blinkered by the Office 365 component of a compromise. A significant number of users will never use Office 365 off premises and as such evidence of accessing phishing links, of malware infection and other user activity may be in more traditional locations.

Commonly it is desirable to understand which users have not just received a phishing email but also followed malicious links. While the use of DNS logs, Proxy and Firewall logs may not provide 100% coverage they can be an invaluable source of evidence in identifying at least some of the impacted users.

Office 365 Management Activity API

While Unified Audit Log Search may not be enabled on a tenant, much of the same data is still accessible (albeit with less granularity where mailbox auditing is not enabled) via use of the Office 365 Management Activity API.

This API is worthy of a separate post on its own and this post is long enough as it is, so I won't go into full detail here, but in the meantime some useful resources are provided below:

Additionally, while the above resources would assist in writing your own tool/application to query the API for useful information some of the hard work has already been done for you and some example tools and scripts which make use of the API are as follows:

AdminDroid Office 365 Reporter is one third-party tool I have used in the past (it was deployed by a client); it leverages the API and allows user activity reports, along with many other useful reports, to be pulled.

Likewise, if the organisation in question has a SIEM with O365 integration, it is likely already using the Office 365 Management Activity API to pull data out of the audit logs, and this data may be retained longer than the 90-day limit. In such cases the organisation's own SIEM may be the best source to query.

Azure Active Directory reporting audit API

Azure AD has an 'audit logs activity report' and 'sign-ins activity report', as well as 'Risky sign-ins' and 'Users flagged for risk' functionality available. These reports and metrics however require an Azure Active Directory premium subscription (P1 or P2).

If at least one user has a license for Azure AD Premium then the sign-ins activity report within the Azure AD Portal can be used to provide information regarding sign-ins. These reports are accessible via the GUI and can be downloaded as CSV. Further details are provided here.

While I have used the Azure AD GUI to pull logs, I haven't played with the associated API yet and need to perform some testing. I am particularly interested to determine whether adding an Azure AD P1 subscription to one user post-incident will allow for historical visibility. My limited testing of the GUI suggests that logs from prior to an Azure AD subscription being added are available once the subscription starts, but I have read contrary reports in a number of places.

A PowerShell script called 'Pull Azure AD Sign In Reports' has been put together by Microsoft employee Tim Springston and will pull these reports if the appropriate subscription is available.

Other Sources

No doubt other evidence sources will exist depending on the type of incident which occurs. One notable complication will be if an administrative account is compromised as there may be concerns that unauthorised admin actions have been performed.

Besides various queries which can be used to search for evidence of admin credential abuse (e.g. looking for recently added and modified accounts etc) it is also possible to use the Search-AdminAuditLog cmdlet and Admin Audit reports within the Security & Compliance Center to investigate such concerns.

Additionally, if evidence of unauthorised SharePoint access is identified or suspected, then having audit settings enabled for the associated site collection will be invaluable. Details of SharePoint audit settings are available here. The associated events will be populated in the UAL and an example command is as follows:

Search-UnifiedAuditLog -StartDate [YYYY-MM-DD] -EndDate [YYYY-MM-DD] -RecordType SharePointFileOperation -Operations FileAccessed -SessionId "SharepointInvestigation" -SessionCommand ReturnNextPreviewPage

This command will return FileAccessed events during a specified date range for all sites where this event is recorded.


Hopefully this post is useful to those engaged in investigating instances of Office 365 account compromise. No doubt there are other scenarios and other evidence sources that I won't have thought of, but that's what the comments section is for. I will keep this post updated as I continue to test some areas further.


exFAT Timestamps: exFAT Primer and My Methodology

Following my quick and dirty post on exFAT timestamp behavior I wanted to follow up with a fuller post (or posts) detailing my methodology and observations. I started looking into the workings and different implementations of exFAT because it had been raised by David Cowen in one of his recent 'Sunday Funday' challenges, but everywhere I looked there was something interesting requiring further analysis. To that end I will be following up with a few posts detailing how different operating systems handle exFAT and how different tools (forensic and non-forensic) interpret the filesystem.

The original question posed by David was:
ExFAT is documented to have a timezone field to document which timezone a timestamp was populated with. However most tools just see it as FAT and ignore it. For this challenge document for the following operating systems how they populate ExFAT timestamps and which utility will properly show the correct values.
Operating systems:
  • Windows 7
  • Windows 10
  • OSX High Sierra
  • Ubuntu Linux 16.04
At first this seemed like a pretty basic challenge, but it wasn't until I started poking around that I realised the myriad of issues and inconsistencies in how different operating systems implement exFAT.

This post will serve as a brief introduction to exFAT, focusing principally on how it handles the recording of MAC times. I will also detail my testing methodology for those who are interested before I expand on results in some follow up posts.

exFAT Primer

I won't go into full detail of the history, functionality and workings of exFAT, principally because I would all too quickly expose my ignorance and, more importantly, because a much better job than I could hope to match has already been done by Robert Shullich and Jeff Hamm, whose various resources I came to rely upon when researching exFAT.

Key resources/ references were:
Robert Shullich’s GCFA Gold Paper ‘Reverse Engineering the Microsoft Extended FAT File System (exFAT)’ (2009)
Jeff Hamm’s paper ‘Extended FAT File System’ (2009)
Jeff Hamm’s post on the SANS Digital Forensics and Incident Response Blog entitled ‘exFAT File System Time Zone Concerns’ (2010)
Jeff and Robert's presentation/talk entitled 'exFAT (Extended FAT) File System - Revealed and Dissected' (2010)

I highly recommend the above resources if you need a crash course in exFAT.

As this research focused on the time zone field used in exFAT directory records, and on how exFAT timestamps are populated by different operating systems, I have ignored other significant portions of the filesystem in this primer.

Within exFAT, file metadata is stored within Directory Entries. The Directory Entry associated with any particular file will comprise at least three records: a Directory Entry Record, a Stream Extension and a Filename Extension. Records are 32 bytes long, with the first byte being a Type Code which identifies the type of record. Type Codes are addressed in more detail later. 

Due to the details of the testing performed later, the overwhelming majority of Directory Entries reviewed in this post will contain only three records; however, it should be noted that it is common for a Directory Entry to contain more than three records. This occurs when a filename exceeds a certain length: the maximum filename length which a single Filename Extension Record can support is 15 characters, with longer filenames requiring additional Filename Extension Records. Examples of both are seen in the screenshot below:

This screenshot is associated with an exFAT volume named 'WinFormat'; at the root of the volume there is one directory, 'System Volume Information' (highlighted in red), and four files, '16_04.txt', '18_04.txt', 'Win10.txt' and 'Win7.txt', the last of which is highlighted in green. 

Focussing first on the root entry, we have three Directory Entry Records: a 'Volume Label Directory Entry' (starting 0x83), an 'Allocation Bitmap Directory Entry' (starting 0x81) and an 'Up-Case Table Directory Entry' (starting 0x82). These are covered in more detail in the recommended reading; however, it is the Directory Entries associated with files, which contain our file metadata, that we are most interested in. Incidentally, 0x83 is the Type Code associated with a named Volume Label Directory Entry; if the volume were unnamed, the Type Code would be 0x03.

This screenshot shows the two entries associated with a directory (red) and a file (green) respectively, the latter serving as an example of the use of additional records, detailed above, when a filename is too long. We can see that there are three records associated with the red Directory Entry, a Directory Entry Record (0x85), Stream Extension (0xC0) and Filename Extension (0xC1), while the Directory Entry highlighted in green is the same but contains two Filename Extensions (two consecutive records with the 0xC1 Type Code).

In the following screenshot we dive into the Directory Entry associated with the same file. I couldn’t find any good worked examples for manual decoding, so I made this one and hope it is useful.

A brief description of the components is as follows:

Type Code – Details the Record Type. When dealing with active (not deleted) Directory Entry Records, we are interested in 0x85, 0xC0 and 0xC1. We will only deal with these three as we explore Timestamp and Timezone behaviour.

Notable Type Codes you are likely to encounter are:

Type Code  Definition
0x85       Directory Entry Record
0x83       Volume Name Record, Master Entry (Named Volume)
0x03       Volume Name Record, Master Entry (Unnamed Volume)
0x82       Up-Case Table Logical Location and Size
0x81       Bitmap Logical Location and Size
0xC0       Directory Entry Record, Stream Extension
0xC1       Directory Entry Record, Filename Extension
0x05       Deleted File Name Record
0x40       Deleted File Name Record, Stream Extension
0x41       Deleted File Name Record, Filename Extension
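
For anyone wanting to script a review of directory records, the table above translates directly into a lookup. The following is a minimal Python sketch (the names and structure are my own, not taken from any existing tool) which walks a run of 32-byte records and labels each by its Type Code:

```python
# Map the first byte of each 32-byte exFAT directory record to a
# description, per the Type Code table above.
EXFAT_TYPE_CODES = {
    0x85: "Directory Entry Record",
    0x83: "Volume Name Record, Master Entry (Named Volume)",
    0x03: "Volume Name Record, Master Entry (Unnamed Volume)",
    0x82: "Up-Case Table Logical Location and Size",
    0x81: "Bitmap Logical Location and Size",
    0xC0: "Directory Entry Record, Stream Extension",
    0xC1: "Directory Entry Record, Filename Extension",
    0x05: "Deleted File Name Record",
    0x40: "Deleted File Name Record, Stream Extension",
    0x41: "Deleted File Name Record, Filename Extension",
}

def identify_records(directory_bytes):
    """Yield (offset, type_code, description) for each 32-byte record."""
    for offset in range(0, len(directory_bytes), 32):
        code = directory_bytes[offset]
        if code == 0x00:  # an all-zero Type Code marks the end of the directory
            break
        yield offset, code, EXFAT_TYPE_CODES.get(code, "Unknown")
```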

Number of Secondary Entries – The number of secondary records associated with the entry. As mentioned previously, my test files generally have short filenames and so their Directory Entries comprise a Directory Entry Record, Stream Extension and Filename Extension; as such they have a value of 2 for the Number of Secondary Entries.

Checksum – Checksum of Record Entry, beyond the scope of this analysis.

Flags – Metadata Flags (Read Only, Hidden, System File and Archive)

Created/ Last Modified/ Last Accessed – Now we are talking… These are 32-bit MSDOS timestamps with the associated limitation of a granularity of 2 seconds. This limitation is accounted for by the use of additional fields which permit greater granularity, as detailed later.
A detailed explanation of the 32-bit Windows Time/Date format can be found here, but to be honest most analysis tools or hex editors are perfectly capable of parsing it.

Creation/ Last Modified centisecond offset – Additional fields associated with the Creation and Last Modified timestamps allow an operating system to record a value between 0 and 199 to denote the number of centiseconds which should be added to the recorded MSDOS timestamp. Note, no such field exists for the ‘Last Accessed’ value.
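
Putting the MSDOS timestamp and its centisecond adjustment together, decoding can be sketched in a few lines of Python. The bit layout reflects my understanding of the 32-bit format from the resources listed earlier, and the function name is my own invention:

```python
from datetime import datetime, timedelta

def decode_dos_timestamp(value, centiseconds=0):
    """Decode an exFAT 32-bit MSDOS timestamp plus the optional 0-199
    centisecond adjustment. Bit layout (from least significant bit):
    0-4 two-second count, 5-10 minutes, 11-15 hours,
    16-20 day, 21-24 month, 25-31 years since 1980."""
    second = (value & 0x1F) * 2
    minute = (value >> 5) & 0x3F
    hour = (value >> 11) & 0x1F
    day = (value >> 16) & 0x1F
    month = (value >> 21) & 0x0F
    year = ((value >> 25) & 0x7F) + 1980
    return datetime(year, month, day, hour, minute, second) + timedelta(
        milliseconds=centiseconds * 10)
```

Note that the result is a naive datetime carrying no timezone information; that is what the Time Zone Code fields, discussed next, are for.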

Created Time Zone Code / Modified Time Zone Code / Accessed Time Zone Code – 
The Time Zone Code is a one-byte value. The most significant bit denotes whether the application/OS which last updated the timestamp supported and made use of the Time Zone Code, and therefore whether the time zone offset should be applied during subsequent interpretation. The required offset is then stored as a 7-bit signed integer. The integer itself represents 15-minute increments from UTC, allowing for the accommodation of timezones which do not fall on the hour. In case you were wondering, examples include Eucla, Western Australia (UTC+8:45), the Chatham Islands, New Zealand (UTC+12:45) and Nepal (UTC+5:45).

For ease of reference I have transcribed some common (read: “not Eucla, WA”) timezones and their corresponding Time Zone code.

Hex   Time Zone   Hex   Time Zone
0xB8  UTC+14      0x80  UTC
0xB4  UTC+13      0xFC  UTC-1
0xB0  UTC+12      0xF8  UTC-2
0xAC  UTC+11      0xF4  UTC-3
0xA8  UTC+10      0xF0  UTC-4
0xA4  UTC+9       0xEC  UTC-5
0xA0  UTC+8       0xE8  UTC-6
0x9C  UTC+7       0xE4  UTC-7
0x98  UTC+6       0xE0  UTC-8
0x94  UTC+5       0xDC  UTC-9
0x90  UTC+4       0xD8  UTC-10
0x8C  UTC+3       0xD4  UTC-11
0x88  UTC+2       0xD0  UTC-12
0x84  UTC+1
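
The decoding behind this table can be expressed in a few lines; the following Python sketch (function name my own) interprets the one-byte Time Zone Code as described above:

```python
def decode_timezone_code(code):
    """Interpret a one-byte exFAT Time Zone Code. Returns the UTC offset
    in minutes, or None when the most significant bit is clear (i.e. the
    writing OS/application did not populate the field)."""
    if not code & 0x80:
        return None
    offset = code & 0x7F      # 7-bit signed integer, in 15-minute increments
    if offset >= 0x40:        # sign-extend the negative range
        offset -= 0x80
    return offset * 15
```

For instance, 0x84 decodes to +60 minutes (UTC+1) and 0xFC to -60 minutes (UTC-1); the 15-minute increments also accommodate the likes of Eucla, with 0xA3 decoding to +525 minutes (UTC+8:45).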

Length – Filename Length (Unicode Characters)

Filename Hash – A hash of the filename, used for expediting searches.

Valid Data Length – The length of valid (actually written) data in bytes; for a fully written file this matches the Data Length (logical file size).

First Cluster Address – Address of the first cluster...

Data Length – Logical File Size (Bytes)

Filename – Unicode string of filename

On the basis of the above, it is clear that if we are concerned with a correct understanding of the time/date artefacts associated with analysed evidence then there are 8 fields we are interested in: the Created, Modified and Accessed timestamps, each of their corresponding Time Zone Codes, and the 0-199 'Centisecond Offset' fields associated with the Created and Modified timestamps.
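
To illustrate where those 8 fields sit, the following Python sketch pulls them from a 32-byte 0x85 Directory Entry Record. The offsets are as I understand them from the resources referenced earlier; verify them against your own hex view before relying on this:

```python
import struct

def extract_time_fields(record):
    """Pull the eight time-related fields from a 32-byte 0x85 File
    Directory Entry Record (offsets assumed per the referenced papers)."""
    assert len(record) == 32 and record[0] == 0x85
    created, modified, accessed = struct.unpack_from("<III", record, 8)
    return {
        "created": created,        # 32-bit MSDOS timestamps
        "modified": modified,
        "accessed": accessed,
        "created_cs": record[20],  # 0-199 centisecond adjustments
        "modified_cs": record[21],
        "created_tz": record[22],  # one-byte Time Zone Codes
        "modified_tz": record[23],
        "accessed_tz": record[24],
    }
```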

OS Population of Time Stamp and Time Zone fields

The first half of the puzzle is understanding how different operating systems employ the different fields. David posed the question in relation to Windows 7, Windows 10, OSX High Sierra and Ubuntu Linux 16.04; however, as I am a glutton for punishment, I have chosen to extend the scope to Windows 8.1 and Ubuntu 18.04. I will note here that I strongly considered playing with XP (with added exFAT support) and Vista, as I know that they throw a further spanner in the works with their partial implementation of Microsoft’s own filesystem’s functionality… but maybe another day.

Before I detail the testing performed I want to highlight some limitations (deliberate and otherwise):
  1. I am only considering removable media, which is to say, I am considering the impact on an analyst who receives a piece of removable media with no indication as to the configuration of system(s) which have interacted with it. 
  2. I have had to employ virtual machines for some of the OS testing. My assumption is that USB passthrough within VMWare Workstation works at the physical layer and that a mass storage device used in this way will operate in the same manner as a directly connected device. But it IS an assumption.
  3. I’m a human being. Many of the operations were deliberately performed manually as opposed to using scripts; as such, time/date values which I recorded in notes are expected to be off by a few seconds. This could cause errors in my interpretation of the results of any testing, but it was considered throughout the process.
Test Environment
The following systems were used for testing:
  • Windows 10 – Custom Build PC (Windows 10.0.17134.112)
  • Windows 8.1 – VMWare Workstation 14 hosted VM using USB passthrough (Version 6.3.9600)
  • Windows 7 – VMWare Workstation 14 hosted VM using USB passthrough (Version 6.1.7601)
  • Ubuntu 18.04 – VMWare Workstation 14 hosted VM using USB passthrough (Fresh install required addition of exfat-fuse driver)
  • Ubuntu 16.04 – VMWare Workstation 14 hosted VM using USB passthrough (SIFT Workstation)
  • OSX High Sierra – MacBook Pro installed with OSX 10.13.3
Tests Performed
Broadly, my testing procedure was as follows:

Volume creation tests
Connect a physical USB device to each of the operating systems and format the drive as exFAT, capturing an image of the drive after each format. This was to facilitate examination of differences (if any) in how the tested operating systems generated the exFAT volume.
File Write Tests (A)
  1. Format a USB device using Windows 10 to employ the latest ‘official’ implementation of exFAT.
  2. Create directories associated with each tested OS on the USB flash drive
  3. Present the flash drive to each test system and then manually perform the following steps:
    1. Create file on USB device (noting time)
    2. Create file on desktop of test system (noting time)
    3. Copy file from desktop to USB device (noting time)
    4. Change Timezone (noting change details)
    5. Create file on USB device (noting time)
    6. Copy file from desktop to USB device (noting time)
    7. Revert timezone - London (UTC+1)
    8. Cut/Paste (Move) file to USB device (noting time)
The aim of this test was to establish how the different operating systems employed and modified (or didn’t) the different timestamps during these operations.

During this testing phase all actions were performed manually where possible, e.g. using right click > New File to simulate user behaviour, and copy/paste within the GUI to move files. Ubuntu does not support this method of creating a file (to my knowledge) and I didn't want to rely upon any other significant applications lest they have an impact on how the OS operates; in those instances the ‘touch’ command was used on the command line to generate the test files.

File Write Tests (B)
1. Format a USB device using Windows 10 to employ the latest ‘official’ implementation of exFAT.
2. Present the flash drive to each test system and then employ scripting methods to create a textfile on the root of the drive named per the tested OS and containing a string representation of the date/time at which the action was performed.
a. E.g. for Ubuntu: date > 16_04.txt
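
A Python equivalent of that shell one-liner can be sketched as follows; the mount point and helper name here are hypothetical, purely for illustration:

```python
from datetime import datetime
from pathlib import Path

def write_test_file(mount_point, os_label):
    """Create <os_label>.txt at the volume root, containing the
    wall-clock time at which the write was performed."""
    target = Path(mount_point) / (os_label + ".txt")
    target.write_text(datetime.now().isoformat() + "\n")
    return target

# e.g. write_test_file("/media/usb", "16_04")  # hypothetical mount point
```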

File Write Tests (C)
Replicated File Write Tests (A) in a fully automated fashion where unexplained anomalies had been observed. Only performed for a limited number of Operating Systems and specific tests. 

The results of all tests were then written to a spreadsheet where the date/time as recorded in notes could be compared to the relevant fields as they appear on disk. I then furnished the same spreadsheet with details of how each tool I analysed displayed the timestamps, and as I am sure any reader will agree this made things immediately clear:

While the spreadsheet of doom is fairly daunting it does allow for the easy identification of patterns and anomalies, the core conclusion at this point is that there were more anomalies than consistencies and that there was a lot more work to do.

I will be exploring the different operating systems one by one in follow up posts and will go into how the relevant fields are (or are not) populated.


exFAT Timestamp Behavior Associated with Different Operating Systems

Much like my recent post on ‘Using Extended MAPI Properties to determine email sent time’ this post is a response to David Cowen’s ‘Sunday Funday' challenge as detailed over at ‘Hacking Exposed - Computer Forensics Blog’.

The question posed by David was as follows:

ExFAT is documented to have a timezone field to document which timezone a timestamp was populated with. However most tools just see it as FAT and ignore it. For this challenge document for the following operating systems how they populate ExFAT timestamps and which utility will properly show the correct values.
  • Operating systems:
  • Windows 7
  • Windows 10
  • OSX High Sierra
  • Ubuntu Linux 16.04

This sort of question/ challenge is exactly the sort of thing I enjoy. I’m always keen to learn something new and find it interesting to test how artefacts behave under various circumstances and determine the implication(s) as they relate to forensic analysis.

With that said, I went decidedly off-piste and ended up down a whole range of rabbit holes exploring odd observations and working to better my understanding of exFAT. To that end, I will be following up with a much longer, possibly even a series of posts detailing my observations and methodology.

For the moment I wanted just to share a few of the high level observations.

In preparation I undertook a significant number of tests, performing different file operations using different operating systems; I also expanded the tests to include Windows 8.1 and Ubuntu 18.04. The results of all tests were written to a spreadsheet where the date/time as recorded in my notes could be compared to the relevant fields as they appear on disk. I then furnished the same spreadsheet with details of how each tool I analysed displayed the timestamps, and as I am sure any reader will agree nothing makes for quick and easy conclusions like a 1332 cell Excel monstrosity.

Unsurprisingly this took a significant amount of time to review, and in all honesty it has not yet been fully mined for its potential value, hence why a more detailed post will follow. On the basis of these tests the following observations are made as they relate to each OS’s handling of the exFAT file system:

How different operating systems use the three available Date/Time fields

The first observations relate to how the operating systems employ the different time fields. Different operating systems implement and maintain different timestamps for files depending on a number of factors, and as such they employ different terminology. For the purposes of this exercise we take as given that exFAT supports the recording of 3 distinct timestamps, referred to in this post as Created, Accessed and Last Modified. We then examine the timestamps presented within the various operating systems and correlate them to the exFAT field from which they draw their data.

Operating System   OS Name        exFAT Field
OSX 10.13.3        Created        Created
OSX 10.13.3        Accessed       Accessed
OSX 10.13.3        Modified       Modified
OSX 10.13.3        Changed        Modified
Windows 7/8/10     Date Created   Created
Windows 7/8/10     Date Modified  Modified
Windows 7/8/10     Date Accessed  Accessed
Ubuntu 16/18.04    Birth          N/A
Ubuntu 16/18.04    Accessed       Accessed
Ubuntu 16/18.04    Modified       Modified
Ubuntu 16/18.04    Changed        Modified

A summary of how each of the tested operating systems utilises the various timestamp, adjustment and timezone fields is outlined below. I hasten to add that these were the results from my initial testing; most will be repeated in preparation for the detailed posts so as to confirm the more notable findings.

Windows 10

Creation Time – Recorded with centisecond granularity, the default 2 second granularity associated with the MSDOS 32bit Timestamp is complemented with an additional field which permits addition of up to 199 centiseconds.

Last Modified Time – Recorded with 2 second granularity as the additional field appears to go unused and is consistently populated with a value of 0.

Accessed Time – Recorded with 2 second granularity due to lack of support in exFAT for greater granularity.

Timezone Fields – All timestamps are recorded as a direct representation of the local time of the system, the UTC offset (as configured on the system) is recorded in the associated timezone fields.

Windows 8 and Windows 7

Creation Time – Recorded with centisecond granularity.

Last Modified Time – Recorded with centisecond granularity when this timestamp is populated as a result of a new file being created (e.g. echo data into a file or right click, new file) however when the file is created as a result of a copy or move action then the centisecond granularity is dropped. The offset is consistently set at 0 and at this stage I have to assume that the timestamp is rounded down to the nearest even second.

Accessed Time – Recorded with 2 second granularity due to lack of support in exFAT for greater granularity.

Timezone Fields – All timestamps are recorded as a direct representation of the local time of the system, the UTC offset (as configured on the system) is recorded in the associated timezone fields.

Ubuntu 16.04 and Ubuntu 18.04

Creation Time – Recorded with nearest second granularity, while exFAT supports the recording of centisecond offsets the only observed offsets used are 0 and 1.

Last Modified Time – Recorded with nearest second granularity, while exFAT supports the recording of centisecond offsets the only observed offsets used are 0 and 1.

Accessed Time – Recorded with 2 second granularity due to lack of support in exFAT for greater granularity.

Timezone Fields – All timestamps are recorded in UTC. The Timezone fields are consistently set to 0x00, indicating that they are not in use.

OSX 10.13.3

Creation Time – Recorded with centisecond granularity.

Last Modified Time – Recorded with centisecond granularity

Accessed Time – Recorded with 2 second granularity due to lack of support in exFAT for greater granularity.

Timezone Fields – Consistently set to “FC” which is UTC-1 and I have no idea why...

Tool Interpretation of Timestamp date

For this half of the question David left it up to us to choose which tools/utilities to test to determine “which utility will properly show the correct values”. I felt it would be most useful to provide an overview of how the major forensics suites (or at least those I have access to) display the data and how they refer to the different timestamps. To that end I set out to test how different tools presented the timestamp information associated with the same filesystem I generated in Part 1, by creating files on a single USB device. 

The following tools have been tested so far:
  • EnCase 7.12.1
  • X-Ways Forensics 19.6
  • FTK 6.2
  • TSK Autopsy 4.7.0
  • Windows 10 Native
  • Ubuntu 18.04 Native

I intend to expand on this list and provide a detailed review of the performance limitations associated with each. But for the moment the summary result is that no tool will reliably display the exFAT timestamps associated with the various tested operating systems. The tools all failed to accurately display the date an action was performed, in one way or another, for at least one of the operating systems. Importantly this isn't always a failure of the tool; as above, the differences in implementation of exFAT cause inconsistencies which no tool could account for.

The tools which failed worst are those which ask the examiner to specify the source timezone; commonly these ignored the timezone data recorded on disk altogether and then adjusted all the decoded MSDOS timestamps at the whim of the user. This scenario is most challenging when an item of removable storage media has been written to by multiple different operating systems, or by multiple instances of the same OS with different timezone settings configured.

One key point to mention is that one type of tool will correctly inform the examiner of the timestamp data written to disk, that is to say that tools which do not attempt to perform any interpretation can't misinterpret that data. 

One example which worked flawlessly during this analysis was using the third party WinHex exFAT Template by Scott Pancoast (https://www.x-ways.net/winhex/templates/index.html) to display the values of the Directory Records. Another interesting point to note here is that manual review of the timezone and timestamp data associated with files on an exFAT volume can provide additional forensic value; for instance, the inconsistencies in how the filesystem is handled and what data gets written to different fields (as detailed above) can help you infer which OS (and maybe which version) was used to write the files to disk.

Thanks to David for inspiring some research that I thoroughly enjoyed, and watch this space for the fuller analysis and findings.