Pillaging .pst Files

This post originally proposed using the open-source Java program Xena and its included plugin for converting .pst files into a searchable format.  It still references Xena, but has been updated to reflect a simpler approach.

On a recent engagement we were able to quickly compromise a client’s network thanks to NetBIOS spoofing and easily cracked passwords.  Of course, the assessment didn’t stop there.  In an effort to impress upon our client the seriousness of a potential real-world breach, we scoured their network looking for their most critical data sets.

Whereas a real-world attacker is not bound to a schedule, security professionals often work under tight deadlines and have limited time to devote to individual engagements.  Therefore, having efficient tactics and techniques in place is essential to make penetration quick and pillaging thorough.


On the aforementioned engagement, we used RDP to connect to a machine which happened to host backup .pst files for a large portion of the users on the target domain.  Microsoft refers to .pst files as “personal storage folders” or “Outlook data files.”  Put simply, .pst files are archives of Outlook data and include not only email messages, but also attachments, contacts, appointments, and other artifacts.

Without Outlook installed, the compromised machine didn’t have native support for reading the .pst files. In an effort to get at the juicy data within, we could have copied the .pst files to our Linux attack boxes and used the readpst program (which is preinstalled on Kali Linux) to convert the .psts to a “grepable” format.  However, most of the .psts were several hundred megabytes in size, and copying files of that magnitude via RDP would have been prohibitively slow and could have tripped an internal alarm.

We ended up completing the assessment without combing through the .pst files, but I was left wanting for some method of pulling useful data out of .pst files short of installing software on the client’s system.  The next time a similar circumstance presents itself, I will be at least slightly more prepared.

Xena, readpst.exe, and Portable Pillaging

After some research, my first possible solution involved using the Windows Remote Desktop Connection settings (or the “-r” switch in the Linux rdesktop command) to include a USB drive as a local resource in the remote session.  That USB drive would have a portable piece of software capable of reading or converting .pst files.  The software in this case is Xena, “Digital Preservation Software” developed by the National Archives of Australia with the intent to provide a single platform for archiving multiple data types in XML, a common, non-proprietary format.  Xena is open source and requires Java to run.

One could install the PortableApps Java Runtime Environment and JAR launcher in order to run Xena from the attached USB drive without installing JRE to the target system.  However, this method is slow, and involves a lot of unnecessary fluff, when all we really need is readpst.exe.

readpst.exe is included in the Xena package here.  You can read about the author’s efforts to create this functional Windows binary from the libpst library on his blog here.

readpst.exe is portable and small enough to be easily dropped from your attached USB drive or other local resource onto the target machine.  Remember, this scenario assumes the computer has already been compromised, and we have moved on to a pillaging phase.  Once readpst has been dropped on the target machine, it’s a simple matter of using the program’s command line interface to convert the targeted .pst files into easily searchable plaintext formats.

Juicy Data

Running readpst.exe without any arguments (or with -h) prints the help screen.  I ran the following command:

readpst.exe -D -S -o output_dir -t ea backup.pst

“-D” tells readpst.exe to include deleted items in the conversion.

“-S” writes the emails to separate files as it converts them, which will allow us to zero in on specific pieces of data and expedite the pillaging process.

“-o” specifies an output directory.  readpst will not create the directory for you, so it must exist already.

“-t” specifies the output type; “e” for email and “a” for attachments.  I did not include “j” for journals or “c” for contacts.
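
Since readpst will not create the output directory, the setup step is easy to forget when converting many .pst files in a row.  Here is a minimal Python sketch of the same invocation.  It assumes Python is available wherever you run the conversion (it usually won’t be on a compromised Windows host, so this is more likely to be useful on your own box) and that readpst.exe is on the path.  The function names build_readpst_command and convert_pst are my own, not part of readpst.

```python
import subprocess
from pathlib import Path


def build_readpst_command(pst_path, output_dir, exe="readpst.exe"):
    """Return the argument list for the readpst invocation shown above."""
    return [
        exe,
        "-D",                    # include deleted items
        "-S",                    # write each email to its own file
        "-o", str(output_dir),   # output directory (must already exist)
        "-t", "ea",              # emails and attachments only
        str(pst_path),
    ]


def convert_pst(pst_path, output_dir, exe="readpst.exe"):
    """Create the output directory, then run readpst against one .pst file."""
    # readpst does not create the directory, so make sure it exists first.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=False: the caller inspects the return code, since readpst may
    # report errors on individual items without the whole run being useless.
    return subprocess.run(build_readpst_command(pst_path, output_dir, exe), check=False)
```

Calling convert_pst("backup.pst", "output_dir") would then run the same command as above, with the directory created beforehand.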


readpst displays its progress as it runs.  There may be errors, and though I have had attachments come out corrupted, the fundamental functionality has always worked.  Once readpst has converted the contents of the .pst file to plaintext documents, you can search through them using the built-in “findstr” command.

First, I searched through the sent email folder for any emails containing the string “password” using the following command:

findstr /s /i /m "password" *.*

“/s” tells findstr to search through the current directory and subdirectories.

“/i” specifies that the search should be case insensitive.

“/m” tells findstr to output the file name rather than the file contents – if we output the contents, we may quickly be swamped with output that we’ll still have to sift through.  Depending on the amount of output, you may also quickly exceed cmd.exe’s limits.

The *.*, of course, means that we’re searching through files of any name and any type.
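
If you end up moving the converted output to a machine without findstr, the same search is easy to approximate.  The sketch below is a rough Python equivalent of the three switches, not a replacement for findstr itself.  The function name files_containing is hypothetical, and it reads each file fully into memory, which is fine for individual emails but not for very large attachments.

```python
import os


def files_containing(root, needle):
    """Return paths under root whose contents include needle, ignoring case."""
    needle_bytes = needle.lower().encode("utf-8")
    matches = []
    # os.walk descends into subdirectories, like findstr's /s.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # skip files we cannot read
            # Lowercasing both sides gives a case-insensitive match, like /i.
            if needle_bytes in data.lower():
                # Record only the file name, like /m.
                matches.append(path)
    return sorted(matches)
```

For example, files_containing("output_dir", "password") would return the list of converted files worth opening first.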


Findstr quickly spit out a long list of files containing the string “password.”  This has narrowed the scope of my search significantly, and hopefully going through these particular files will turn up some valid credentials.

If you’re searching for data that follows a particular format, then using findstr’s regex support is your best bet.  For example, if I were searching for social security numbers, I could use the following command:

findstr /m /r "[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]" *.*

The [0-9] tells findstr that any digit 0-9 can occupy that position in the string.
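
As far as I know, findstr’s regex dialect has no repetition counts, which is why the class is written out once per digit.  For comparison, here is a small Python sketch of the same pattern.  The name find_ssn_like is hypothetical, and the digit checks on either side of the pattern are my addition so that longer runs of digits are not reported as matches.

```python
import re

# Three digits, dash, two digits, dash, four digits.
# The lookbehind and lookahead reject matches embedded in longer digit runs.
SSN_PATTERN = re.compile(r"(?<![0-9])[0-9]{3}-[0-9]{2}-[0-9]{4}(?![0-9])")


def find_ssn_like(text):
    """Return every substring of text shaped like a social security number."""
    return SSN_PATTERN.findall(text)
```

Anything this returns is only a candidate; a string in the right shape is not necessarily a real social security number.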


It looks like findstr was able to find one sent email with a string satisfying our regular expression.  Since there’s just one result, I’ll omit the /m switch so that findstr dumps the file contents rather than just the name:


And there you have it.  Following this method would allow us to identify the one email with PII in it out of a .pst file of over 500 MB in seconds, without transferring the file or having to rely on Microsoft Outlook.

For a gruesome level of detail on findstr and its capabilities and limitations, check out this question and answer thread on Stack Overflow.


As is so often the case, this method is straightforward; it simply relies on knowing what tools are out there.  I hope you find this method practical and useful.  Thanks for reading.

digby sends.