Forensic Analysis of Systems that have Windows Subsystem for Linux Installed

*** EDIT 2017-10-17 ***

Windows Subsystem for Linux was in Beta at the time of writing and all artifacts/ paths mentioned relate to installation of WSL/ 'Bash on Ubuntu on Windows', using the Beta methodology detailed in my last post. Installation of a Linux userland via the Windows Store causes notable files to be located elsewhere within Windows. This is addressed in my addendum post 'Further Forensicating of Windows Subsystem for Linux'.

*** /EDIT 2017-10-17 ***

In my last post I provided a little bit of background on Windows Subsystem for Linux ('WSL') and provided details on some of the artifacts you can look out for to identify if it has been installed and enabled. In this post we will dive into a few of the notable artifacts which speak to user activity within WSL and also some considerations when reviewing this information. The more you look the more you find, and there are a myriad of interesting edge cases you may come across. The relevance of any particular artifact will depend on your investigation and the way the suspect/user was using WSL.

In this post I will be focusing on a few specific artifacts, particularly those which relate to user activity and which have unique aspects as they relate to WSL when compared to their counterparts in a full Linux host. This is by no means a comprehensive list as the topic could and does fill multiple books, written by far smarter people than me.

Forensically Interesting Artifacts

One point to note is that where I specify an artifact location I will be referencing its location on disk as it appears when viewed in Windows or during dead box analysis this will be the path displayed in tools. Paths as they appear to the user within WSL will be different due to the various mount points that are employed, this is addressed further later in the post.

It is key to note that the installation of WSL is user specific, so there may be multiple instances of WSL installed on one system. How many you will find will depend on the number of user accounts and the number of those users who have installed a userland. My reading indicates that multiple userlands can be installed side by side, but I have yet to experiment with this. In any event, it's something to look out for and if it is indeed possible, I would assume that it will result in multiple locations to analyse per user.

Default Username

As previously mentioned, the userland installation is per user. Additionally, authentication is entirely independent of Windows authentication/ login credentials. A user is requested to define a UNIX username and password when they first run Bash. All data associated with the filesystem for the Linux userland is contained within the associated Windows users profile at %localappdata%\Lxss. In this MSDN blog post Microsoft elaborate on this, saying "Each Windows user has their own WSL environment, and can therefore have Linux root privileges and install applications without affecting other Windows users".

It should therefore be noted that irrespective of the chosen Linux username any activity identified within a particular WSL install can/should be attributed to the corresponding Windows User Account. To labour the point and make this abundantly clear, some example scenarios are outlined below, the examples are based on a Windows system with multiple users, 'UserA', 'UserB', etc:

Scenario 1: User 'UserA' installs WSL and 'Bash on Ubuntu on Windows' and sets their Linux username as 'UserA'. Based upon human nature it is likely that a significant proportion of users will use the same username and password for WSL as they do for Windows. This is probably the most common scenario you will encounter.

Scenario 2: User 'UserA' installs WSL and 'Bash on Ubuntu on Windows' however sets their UNIX username to be 'UserB'. At a glance artifacts associated with the WSL user 'UserB' may seem to relate to the Windows User 'UserB' however they should in fact be attributed to 'UserA'. For example, the .bash_history file for this user will be located at:

C:\Users\\**UserA**\AppData\Local\Lxss\home\\**UserB**\.bash_history

i.e. within the windows profile associated with UserA, the same can be said for almost all notable artifacts as they relate to WSL and user activity.

Scenario 3: Users 'UserA' and 'UserB' both install WSL and 'Bash on Ubuntu on Windows' and set their UNIX usernames as 'UserC'. These two installations are completely distinct and there is no connection between the two 'UserC' accounts despite them sharing a name.

The key point to note is that installations are user specific, independent from one another and the majority of relevant artifacts will exist within a Windows User directory or NTUSER.DAT for a specific associated Windows User Account.

The username configured at the time of installation is recorded within the NTUSER.DAT for the associated Windows user at the location below:

Key: SOFTWARE\Microsoft\Windows\CurrentVersion\Lxss
Value: DefaultUsername (RegSz)
Data: [username]

Various other registry keys were identified which relate to lxss and Windows Subsystem for Linux however the majority of notable artifacts, particularly as they relate to user activity are to be found on disk.

Prefetch Files

As mentioned in my previous post, Bash itself, as it relates to ‘Bash on Ubuntu on Windows’, is an executable located at %systemroot%\System32\bash.exe. When a user selects the ‘Bash on Ubuntu on Windows' shortcut, either from the desktop, start menu or taskbar, bash.exe is executed with an argument of ‘~’ and the user is presented with a new window placing them within Bash at the home directory of the default user. This causes the bash.exe prefetch file to be updated, assuming prefetch is enabled. This may provide some useful information as to the last and recent run times associated with the executable, which will evidence the use of Bash.

With that said, Bash can be accessed in a number of ways including by executing the 'bash' command from within the cmd prompt or from a PowerShell prompt. Neither of these execution methods cause the bash.exe prefetch file to be updated. Running the bash command from the run dialog does update the associated prefetch file.

Another detail which is notable regarding executing the ‘bash’ command from a cmd or PowerShell prompt is that this will not place the user at their WSL home directory but rather at the location which the cmd/ PowerShell prompt was previously pointing.

WSL User Accounts/ Credentials

As you might expect the passwd and shadow files will be your go to location for details of the users who are set up within a specific WSL install and their associated credentials. These are located at:

C:\Users\[Username]\AppData\Local\lxss\rootfs\etc\shadow
C:\Users\[Username]\AppData\Local\lxss\rootfs\etc\passwd

These files and their respective backups 'shadow-' and 'passwd-' can be reviewed for useful information regarding the users who have been configured within WSL. The password hashes for these accounts can be extracted from the shadow file for cracking, should this be useful in your case.

Bash History

When determining user activity on a Linux system, the bash history is often the first port of call. The .bash_history file maintains a record of a user's command history within Bash. WSL systems with Bash for Windows installed are no exception and the bash_history file for each user can be located at:

C:\Users\[Username]\AppData\Local\Lxss\home\[WSLusername]\.bash_history

and for root:

C:\Users\[username]\AppData\Local\lxss\root\.bash_history

Anyone familiar with investigating breaches of Linux systems will likely be aware that the .bash_history file is a fantastic asset during analysis, unfortunately so do the bad guys and it’s common for attempts to be made to clear it. Additionally, the Bash history is often a far from comprehensive log as it can fail to be populated under various circumstances; can behave strangely when multiple instances of bash are used simultaneously; and by default, it lacks timestamps. The behaviour of bash history under various edge cases is an interesting topic in itself and indeed it is the topic of a great presentation by Hal Pomeranz titled 'You Don't Know Jack About bash_history' a recording of which is available here.

Key points to note when reviewing the bash history under WSL:

  • Bash history file can be deleted from within Bash
  • Bash history file can be deleted from Windows
  • By default, the bash history file is only populated when a shell exits
  • Bash history file may not be populated depending on how Bash exits

Unfortunately, by default .bash_history is only populated when Bash for Windows closes cleanly. If the task is killed from task manager, or if the user simply closes the window using the close button rather than typing 'exit' then the bash_history file is not updated with the commands from that session. Windows users are somewhat accustomed to closing windows down with the close button and so I expect that bash history will be less complete for WSL when compared to a regular Linux system.

If a suspect deletes the Bash history outright, either using Bash or via Windows, there is the potential to recover the deleted file. You are likely to be at the added advantage that the file will be deleted from an NTFS filesystem as opposed to those you commonly encounter when analysing Linux systems, often making recovery easier. Further benefits of this Windows/Linux hybrid environment you are analysing include the fact that a Volume Shadow Copy may contain historical copies of the .bash_history file.

An additional area for future research will be the recovery of bash history for live or ungracefully closed sessions from RAM captured from the Windows host. It’s a fairly niche case but conceivably, if you are able to capture a RAM image of a live system where an open Bash session exists the bash history (yet to be written to disk) may be recoverable from RAM. A debatable alternative is of course to gracefully close Bash prior to performing disk imaging, causing the .bash_history to be populated with the latest sessions data, but you know your case and jurisdictional restrictions better than I do so make sure you are on the right side of them.

Other Artifacts

It would be futile to attempt to detail and discuss all relevant artifacts within this post. Artifacts such as the files contained within '.ssh' or ‘.gnupg' or values in '.viminfo' (which may be relevant depending on your case) can be found where you would expect them. Various log files, with some notable exceptions (such as syslog) are to be found within the same location as you might expect to find them in an Ubuntu install.

One key area which has impact on analysis is the way WSL interacts with the host Windows file system and vice versa.

Filesystem Interaction (Accessing the Windows filesystem from WSL)

The first point to note about filesystem interaction is that the root filesystem for WSL exists within a directory on the local OS volume which will commonly be formatted with NTFS. This has significant implications in forensic analysis, including what metadata is stored for files and the likelihood of recovering deleted data. As alluded to earlier, the full directory structure associated with WSL is subject to Volume Shadow Copy and as such historical copies of files may be recoverable from that source, I have successfully used this to recover bash history that was otherwise unavailable.

Fixed storage devices associated with the host filesystem are presented to WSL as mounted devices at /mnt/[driveletter], and by default the 'noatime' attribute as set. e.g.:

Example content of /proc/mounts:

rootfs / lxfs rw,noatime 0 0
data /data lxfs rw,noatime 0 0
cache /cache lxfs rw,noatime 0 0
mnt /mnt lxfs rw,noatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,noatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,noatime 0 0
none /dev tmpfs rw,noatime,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,noatime 0 0
none /run tmpfs rw,nosuid,noexec,noatime,mode=755 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,noatime 0 0
none /run/shm tmpfs rw,nosuid,nodev,noatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,noatime,mode=755 0 0
C: /mnt/c drvfs rw,noatime 0 0
D: /mnt/d drvfs rw,noatime 0 0
root /root lxfs rw,noatime 0 0
home /home lxfs rw,noatime 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,noatime 0 0

Prior to writing this post I was under the impression that to access a volume from within WSL it had to be formatted as NTFS or ReFS. However, per this MSDN blog from April, it appears support for additional filesystems has been added, although the requisite update is only available via the Windows 10 Insider Preview at the moment. In any event, I will focus on NTFS as in the majority of cases this is likely the file system onto which WSL will have been installed.

The way Microsoft have implemented the filesystem, so as to emulate Linux behaviour while the data actually resides upon an NTFS or ReFS volume is an interesting topic, and one which is covered well in an MSDN blog post here. To quote and paraphrase: "WSL provides access to Windows files by emulating full Linux behavior for the internal Linux file system with VolFs, and by providing full access to Windows drives and files through DrvFs. As of this writing, DrvFs enables some of the functionality of Linux file systems, such as case sensitivity and symbolic links, while still supporting interoperability with Windows".

"When opening a file in DrvFs, Windows permissions are used based on the token of the user that executed bash.exe. So in order to access files under C:\Windows, it’s not enough to use “sudo” in your bash environment, which gives you root privileges in WSL but does not alter your Windows user token. Instead, you would have to launch bash.exe elevated to gain the appropriate permissions.

"VolFs is used to mount the VFS root directory, using %LocalAppData%\lxss\rootfs as the backing storage. In addition, a few additional VolFs mount points exist, most notably /root and /home which are mounted using %LocalAppData%\lxss\root and %LocalAppData%\lxss\home respectively. The reason for these separate mounts is that when you uninstall WSL, the home directories are not removed by default, so any personal files stored there will be preserved."

The architecture of VFS within WSL as used to facilitate interoperability is conveyed in the below diagram which is taken from the same blog post:


https://blogs.msdn.microsoft.com/wsl/2016/06/15/wsl-file-system-support/

The really useful MSDN blog posts by Jack Hammons as they relate to WSL should be considered compulsory reading if you find yourself performing a complex analysis of WSL.

This approach means that the full host file system (assuming fixed NTFS/ ReFS volumes) is accessible from within WSL. Linux tools can be used to view, modify and delete data/files on the fixed disks without leaving the traces we might expect to find were a user to perform similar actions via Windows Explorer. The facility to view, and modify data without leaving traditional traces may be attractive to malicious users. Additionally, commands such as 'touch' and 'shred', which are available by default, may be tempting to the savvy WSL user who is looking to perform antiforensics.

The fact that the WSL files exist within an NTFS filename not only impacts the likelihood of recovery but it also creates some interesting scenarios. One such example is that it is possible to create files using Linux's case sensitivity which have filenames which are identical, other than their case. Per the below screenshot, it is possible to use WSL to create files which explorer would not normally permit:


Three files were created using Windows Subsystem for Linux

While this is possible via WSL, various windows applications aren't going to like it. Windows can in fact support case sensitivity, it is supported in NTFS and can be switched on by modifying the HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\ dword:ObCaseInsensitive registry value. But by default, this support isn't enabled and irrespective of this WSL will allow case sensitive filenames to be created as per the screenshot. Attempting to copy the files to another location causes errors.

This hybrid filesystem approach also has an interesting (and forensically useful) implication for file metadata. Recording file creation (or birth) is not supported by many common Linux file systems and indeed many Linux applications lack support for reading this attribute despite support being introduced in some modern file systems such as ext4. However, the WSL file system exists within an NTFS volume and despite the fact that it cannot be queried or modified from within WSL, a file created timestamp is recorded within the MFT for files created using WSL. This could prove forensically useful in all manner of cases.

Filesystem Interaction (Accessing the WSL filesystem from Windows)

As detailed above, VolFS and DrvFS are employed to mount different parts of the WSL file system and this has implications on file behaviour between these two locations. Specifically, VolFs is used to mount the VFS root directory, /root and /home while DrvFS is used for any local fixed drives as well as removable media and network shares with the addition of support for these.

Focusing on the core WSL file structure as located within %LocalAppData%\lxss, direct manipulation (from Windows) of data within that directory structure will not necessarily be reflected within WSL. Windows does not have a concept of inodes and as such VolFs (in an effort to provide support for most VFS features) is required to maintain a separate record of inodes and their associated Windows file objects. If a Windows application (e.g. Explorer or Notepad) are used to create a file in a WSL home directory then that file will not be visible to the WSL user because VolFs was not employed in the creation of the file so certain attributes are missing, causing VolFS to ignore it when access is attempted from witin WSL.

The same occurs with file modification, if a file within a VolFs mounted file system is modified from Windows the MFT metadata will be updated to reflect the new last modified time, however the stat command within WSL will show the original values. Viewing the contents of the file MAY show the correct updated content as the file is accessed on disk (or you may experience corruption/ the files may disappear from view). The reason for this is detailed by Microsoft as such:

"While VolFs files are stored in regular files on Windows in the directories mentioned above, interoperability with Windows is not supported. If a new file is added to one of these directories from Windows, it lacks the EAs needed by VolFs, so VolFs doesn’t know what to do with the file and simply ignores it. Many editors will also strip the EAs when saving an existing file, again making the file unusable in WSL.
Additionally, since VFS caches directory entries, any modifications to those directories that are made from Windows while WSL is running may not be accurately reflected."

The various attributes which are not natively supported by NTFS but are required by VolFs to best mimic Linux file system behavior are stored in NTFS Extended Attributes associated with the associated file. I'm pleased to report that the previously mentioned Microsoft MSDN blogs go into reasonable detail on this, going so far as to detail which information is stored in EAs, reproduced below:

  • "Mode: this includes the file type (regular, symlink, FIFO, etc.) and the permission bits for the file.
  • Owner: the user ID and group ID of the Linux user and group that own the file.
  • Device ID: for device files, the device major and minor number of the device. Note that WSL currently does not allow users to create device files on VolFs.
  • File times: the file accessed, modified and changed times on Linux use a different format and granularity than on Windows, so these are also stored in the EAs
  • In addition, if a file has any file capabilities, these are stored in an alternate data stream for the file. Note that WSL currently does not allow users to modify file capabilities for a file.The remaining inode attributes, such as inode number and file size, are derived from information kept by NTFS."

There is therefore the potential to examine the Extended Attributes to determine the Linux metadata associated with individual files. I say the potential, because fully documenting the way Extended Attributes are used by the Windows Subsystem for Linux is beyond the scope of this post. A screenshot of an example file is provided below:

Historical Windows Subsystem for Linux data

One final takeaway from my various reading was the obviously conscious decision by the WSL development team to separate the rootfs, root and home filesystem and mount them independently. This is evidently done to facilitate the preservation of user data if/when WSL is uninstalled. This preservation is the default behaviour and as such when analysing a system which has had WSL disabled/ uninstalled there are likely to be potential sources of evidence in the home directories which have not been removed (assuming the user hasn't gone out of their way to remove them).

There is a shedload of useful information on the MSDN blog regarding WSL. Much of which I only discovered after hours of tinkering, but this link blogs.msdn.microsoft.com/wsl will be invaluable if you find yourself having to analyse a system where WSL was used by a suspect.