In digital forensics, file system analysis is a precursor task to event reconstruction. Often, unallocated content within a file system is of interest to an investigation, and thus recognition, extraction, and ascription of unallocated files are typical intermediary steps en route to interpreting file system contents. The results of this general workflow comprise a set of intermediary results worth cross-verification, due to their potential impact on later interpretations of initial evidence. However, unallocated files often lack stable identifiers, presenting subtle challenges that can foil algorithmic comparison. Unallocated content recovery requires careful understanding of the storage medium, or storage format, from which the content is recovered (Casey et al., 2019). This work focuses on a model of a file system where an allocated file's definition comprises at least three key dimensions (Carrier, 2005): the data structure housing its metadata, such as timestamps and owner, often referred to as an inode; the data structure compactly housing its location within the file system's namespace, often implemented as a directory entry; and the range within the file system that houses its contents, which might be discontiguous. Prior work has used this model to implement differential analysis, both for comparing changes in a file system's state across time (Garfinkel et al., 2012) and for comparing parse results when multiple tools are run against the same subject image (Nelson et al., 2014). While the three-dimensional file model enables comparison of allocated content with seemingly little difficulty, attempts to verify some POSIX-required characteristics of the allocated content show a weakness in the model that impacts interpretation of both allocated and unallocated files. We present a strengthening of the three-dimensional model, emphasizing a geometric representation of the three file dimensions as a first-order concern.
This pattern extends in applicability beyond file system analysis, but is presented initially in the context of New Technology File System (NTFS) analysis. We demonstrate corrections over a model improvement previously proposed by Casey et al. (2019), and show results from extending two independently developed open-source tools to enable geometric comparability between their NTFS results. Using the tool-agnostic languages Digital Forensics XML (DFXML) (Garfinkel, 2012) and Cyber-investigation Standardized Analysis and Expression (CASE) (Casey et al., 2017), we show that the geometry-based identifier strategy corrects a previous measurement of unallocated content.
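To make the idea of a geometry-based identifier concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each of the three file dimensions can be described by byte runs, that is, (offset, length) pairs measured from the start of the disk image. The names `FileGeometry`, `normalize`, and `compare` are hypothetical and do not come from DFXML, CASE, or either extended tool.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Assumed representation: (offset in image, length), both in bytes.
ByteRun = Tuple[int, int]


def normalize(runs: Iterable[ByteRun]) -> Tuple[ByteRun, ...]:
    """Drop empty runs and merge consecutive runs that are physically adjacent.

    Fragment order is preserved, because the order of a discontiguous
    file's fragments is part of its content layout.
    """
    merged = []
    for offset, length in runs:
        if length <= 0:
            continue
        if merged and merged[-1][0] + merged[-1][1] == offset:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((offset, length))
    return tuple(merged)


@dataclass(frozen=True)
class FileGeometry:
    """Hypothetical identifier built from where a file's three dimensions live."""
    inode_runs: Tuple[ByteRun, ...]    # metadata structure
    name_runs: Tuple[ByteRun, ...]     # directory entry
    content_runs: Tuple[ByteRun, ...]  # file contents

    @classmethod
    def from_runs(cls, inode_runs, name_runs, content_runs):
        return cls(normalize(inode_runs), normalize(name_runs),
                   normalize(content_runs))


def compare(tool_a, tool_b):
    """Return (matched, only_in_a, only_in_b) keyed purely on geometry."""
    a, b = set(tool_a), set(tool_b)
    return a & b, a - b, b - a


# Two tools report the same file, but split its content runs differently.
tool_a = [FileGeometry.from_runs([(1024, 1024)], [(8192, 96)],
                                 [(40960, 4096), (45056, 4096)])]
tool_b = [FileGeometry.from_runs([(1024, 1024)], [(8192, 96)],
                                 [(40960, 8192)])]
matched, only_a, only_b = compare(tool_a, tool_b)
```

Because the identifier depends only on locations within the image, it does not rely on a file name or inode number, which an unallocated file may lack or share with another file. In this sketch the two reports match once adjacent content runs are merged.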