Sorting the WikiLeaks DNC Emails

In this report, we review the DNC email collection published by WikiLeaks. We attribute each email to one of ten (10) DNC staffers. This is new research – some journalists and researchers have suggested that the WikiLeaks DNC email collection disclosed the emails of ten staffers, but this report is the first to provide detailed attribution.

We use this attribution of particular emails to DNC staffers to build an email acquisition timeline. The timeline that we develop stands at odds with statements made in the DOJ indictment of twelve (12) Russian intelligence (GRU) officers. The indictment timeline does not account for over two-thirds of the DNC email collection. We also observe that the indictment implies connections between various facts, but seldom makes specific definitive statements that might be derived from those facts.

For example, the indictment introduces the idea that a “1Gb or so” archive was transmitted from Guccifer 2 to WikiLeaks and gives the impression that this archive might have been the source of the WikiLeaks DNC email publications, but never states this as fact. We show that an archive of this size is too small to hold the entire DNC email collection, which rules it out as the source of the WikiLeaks DNC emails.

Updates

2019-12-20: Fix the ex-filtration timeline chart. The May 25 ex-filtration occurred between 8:21 AM and 9:04 AM (7 hours later than shown in the original chart).

2019-05-06: As further support of our findings, we refer the reader to a new, extensive analysis of the Mueller report, authored by Adam Carter, titled The Mueller Report – Expensive Estimations And Elusive Evidence.

2019-04-22: Merged in material from A Closer Look at Guccifer 2’s DNC Email Attachments, which analyzed the seventeen (17) documents that Guccifer 2 posted that match attachments in the WikiLeaks DNC email collection.

2019-04-18: A few typos in a chart and table were fixed along with some minor textual changes. Thanks go to @JasonA_Ross for his suggestions and corrections.

Conclusions

In a section near the end of this document, we analyze statements made in the Mueller report regarding the alleged hack of DNC emails. We conclude:

  • Apparently, the Special Counsel’s investigators have no proof that the indicted GRU officers actually stole the DNC emails, or that those same emails were the source of the DNC emails published by WikiLeaks.  Further, no mention is made of the second (Nov 6) WikiLeaks release of DNC emails (which was roughly equal in number to the first).  When we add our observation that over two-thirds of the DNC emails were acquired on May 23 (not May 25 through June 1 as stated in the GRU indictment) we conclude that the Special Counsel’s allegations lack merit.

Excerpts from the July 13, 2018 indictment of 12 GRU agents follow. (Emphasis added.)

29. Between on or about May 25, 2016 and June 1, 2016, the Conspirators hacked the DNC Microsoft Exchange Server and stole thousands of emails from the work accounts of DNC employees. During that time, YERMAKOV researched PowerShell commands related to accessing and managing the Microsoft Exchange Server.

47 (b). After failed attempts to transfer the stolen documents starting in late June 2016, on or about July 14, 2016, the Conspirators, posing as Guccifer 2.0, sent Organization 1 [WikiLeaks] an email with an attachment titled “wk dnc linkl.txt.gpg,” The Conspirators explained to Organization 1 that the encrypted file contained instructions on how to access an online archive of stolen DNC documents. On or about July 18, 2016, Organization 1 [WikiLeaks] confirmed it had “the 1Gb or so archive” and would make a release of the stolen documents “this week.”

48. On or about July 22, 2016, Organization 1 [WikiLeaks] released over 20,000 emails and other documents stolen from the DNC network by the Conspirators. This release occurred approximately three days before the start of the Democratic National Convention. Organization 1 did not disclose Guccifer 2.0’s role in providing them. The latest-in-time email, released through Organization 1 was dated on or about May 25, 2016, approximately the same day the Conspirators hacked the DNC Microsoft Exchange Server.

A few observations and comments:

  • The indictment makes no mention of the May 23 ex-filtration of DNC emails (which contributes over two-thirds of the WikiLeaks DNC email collection).  The indictment’s timeline starts on May 25.
  • Based on our analysis of the DNC emails, we see no signs of activity in the May 26 through June 1 timeframe mentioned in the indictment.  The indictment offers no specifics on what may have transpired from May 26 through June 1.
  • The indictment’s statement – “[the] same day [May 25] the Conspirators hacked the DNC Microsoft Exchange Server” implies only a single hacking event, apparently ignoring evidence that 70% of the emails were acquired on May 23.
  • “Approximately the same day” in combination with the statement “between [..] May 25, 2016 and June 1, 2016” leaves open the possibility that a separate hack of the Exchange server may have happened on or after May 25 – this event is hypothetically the subject of the indictment.  We see no evidence of a third acquisition event when we analyze the WikiLeaks DNC email collection.  Yet its existence would explain the apparent discrepancy between our observations and the statements made in the indictment.
  • The indictment says “During that time, YERMAKOV researched PowerShell commands related to accessing and managing the Microsoft Exchange Server.”  (Emphasis added.)  If “that time” is May 25 through June 1, how would this explain the ex-filtration of the DNC emails that were collected on May 23, which make up over two-thirds of the DNC emails published by WikiLeaks?
  • The indictment offers no rationale on why those particular ten (10) individuals who appear in the WikiLeaks DNC email collection were chosen; it doesn’t mention any specific individuals.
  • Wouldn’t the DNC executive staff (DWS and Brazile, for example) have been higher value targets?  Why the emphasis on Finance?  Why not system administration, where we might have learned about the DNC’s coordination with Crowdstrike and/or the FBI?
  • The indictment doesn’t tell us how the perpetrators gained access to the DNC Exchange server or their level of access.   In a properly configured system, administrative privileges would be required to access individual mailboxes on the Exchange server.  On some systems, mail administrators need additional privileges in order to access the Exchange server.
  • Although the indictment states that Yermakov “researched PowerShell commands related to accessing and managing the Microsoft Exchange Server”, it does not say exactly how the emails were exported (or even if PowerShell was used).
  • The indictment mentions several separate events, but does not pull them together into a succinct claim: (1) failed attempts by Guccifer 2.0 to deliver DNC documents to WikiLeaks, (2) transmission of a “1Gb or so archive” to WikiLeaks, (3) a statement that is a summary but not a direct quote that says “[WikiLeaks] would make a release of the stolen documents” followed by the quoted phrase “this week”, (4) “On or about July 22, 2016, Organization 1 [WikiLeaks] released over 20,000 emails and other documents stolen from the DNC network by the Conspirators”, (5) “Organization 1 did not disclose Guccifer 2.0’s role in providing them”.
  • Although a connect-the-dots interpretation suggests that the DNC emails were the “documents” provided by Guccifer 2, nowhere does the indictment make a simple, clear claim along the lines “On July 22, 2016 Organization 1 [WikiLeaks] published the emails taken by the Conspirators on May 25, when they hacked the DNC Exchange Server.”  Further, the statement that WikiLeaks “did not disclose Guccifer 2.0’s role in providing them” alludes to the idea that Guccifer 2.0 provided them, but the indictment never states this as fact.
  • The indictment makes no mention of the second and last WikiLeaks document release, published on November 6, 2016.  As we have shown, this DNC email release was approximately equal in size and document count to the first email dump published on July 22, 2016.  All the documents in this last November 6 release appear to have been acquired on May 23 (not May 25 as claimed in the indictment).
  • The indictment describes an encrypted email attachment sent to WikiLeaks by Guccifer 2 on July 14, 2016.  We are told that “[this] encrypted file contained instructions on how to access an online archive of stolen DNC documents”.  If this file were encrypted with WikiLeaks’ public key, we wonder if the Special Counsel’s investigators had the capability of decrypting this attachment, or if they were speculating about its contents?
  • The “1Gb or so” archive mentioned in the indictment was too small to hold the entire DNC email collection.  Further, although the indictment implies that this Zip file may have contained the DNC emails, the indictment never directly states that this Zip file was the source of the DNC email collection.

Credits

This article builds on recent analysis done by William Binney (a former NSA technical director and whistleblower) and Larry Johnson (a former CIA and State Department Counter Terrorism officer).  Their report is titled Why the DNC was not Hacked by the Russians (Feb. 2019).  Binney and the Veteran Intelligence Professionals for Sanity (VIPS) published a follow-up article, VIPS: Mueller’s Forensics-Free Findings (March 2019), which summarizes their evidence-based challenge to the mainstream narrative.  In their words, “We veteran intelligence professionals (VIPS) have done enough detailed forensic work to prove the speciousness of the prevailing story that the DNC emails published by WikiLeaks came from Russian hacking.”

Further, this report draws extensively on Adam Carter’s recent research described in FAT Anomalies In Leaked DNC Emails Suggest Use Of Thumbdrive (Feb. 2019).  A Twitter user, @steemwh1sks, documents the results of his related research in this GitHub gist.  Stephen McIntyre (@ClimateAudit) pens a blog post, DNC Hack due to Gmail Phishing?? (March, 2018), which challenges the hastily alleged link (made by the mainstream media) between a Gmail phishing attack and the alleged DNC hack.

Background

On July 22, 2016 WikiLeaks published its first batch of DNC emails.

Although we are told above the names of the DNC staffers who had their emails published, questions have remained about the organization of the WikiLeaks DNC email collection.  In this article, we take a detailed look into the DNC email collection.  We will assign each mail message to a DNC staffer and identify the staffers associated with the second batch of emails (published on Nov. 6, 2016).  With this knowledge in hand, we can make some interesting observations.  We speculate on when and how the DNC emails were acquired, exported as EML files, and then finally uploaded to WikiLeaks.

A few months later, WikiLeaks published their second and final batch of DNC emails just days ahead of the 2016 election.  This almost doubled the number of published DNC emails.  The announcement was minimally updated (a couple of weeks after the release) – no additional DNC staffers were mentioned.

In their Twitter post, WikiLeaks announces this second release.

WikiLeaks suggests that this second and final release has only 8,263 emails. We demonstrate in our analysis that their quoted number likely excludes duplicates and may not include the files that were uploaded to the WikiLeaks site on Aug. 26.

The total count, above, is 44,053, which agrees with the figure stated on the WikiLeaks web site.  By our calculation, the second batch had 21,597 total email messages (inclusive of duplicates).  The first batch was of similar size; it had 22,456 messages.  Although roughly equal in size, this second batch had little to no political impact, perhaps because it was released so close to the election.

In the table above, we see that the first release included email message files (with .eml extensions) with last modification dates of 2016-05-23 and 2016-05-25.  The second release has last modification times of 2016-08-26 and 2016-09-21.  The May 23 and May 25 dates have particular relevance because they appear to be the dates that the emails were acquired and then exported as .eml files. We will explore this aspect in detail below, but first let’s turn to attribution.

How to Identify Local, External, and Sent Folder Emails Using their Header Information

The DNC email collection is made up of individual EML files.  The EML files are text files which are heavily encoded.  Their structure is defined by various standards; some of these standards date back to the 1970s.  You can see the EML file on WikiLeaks when you click on the “View source” tab.  Here is an example.  Before discussing the header information, notice that this appears to be a local email where the system administrator, Yared Tamene, has used the trick of sending the email to himself with a likely long list of recipients or internal email distribution lists on the “Bcc:” line.

After clicking the “View source” tab, we see the raw EML file.

Above, we have an example of an email that is delivered locally.  It has a single “Received:” line for the DNC Microsoft Exchange server.

NOTE: Since Tamene is sending this email to himself and probably bcc-ing the world, we cannot assign this email to a particular DNC staffer based on just the information in this EML file.

Let’s turn to an example of an email that is saved in the “Sent” folder of the individual sending an email from DNC’s mail server.  We choose an email with a low numbered mail id.  Note: The mail ids used by WikiLeaks correspond to the names of the EML files saved on the WikiLeaks web site.

Here we see an email from one of the DNC staffers that we’re interested in.  In this email, Kaplan is sending email to Bobby Shmuck, who worked in the Executive Office of the White House.  This is external to the DNC, obviously, and Mr. Shmuck isn’t one of the DNC staffers of interest.  On this basis alone, we could tentatively assign this email to Jordan Kaplan.  The raw EML file will be more definitive.

There is no Received: header.  We see this on emails that are copied to the Sent folder by the Exchange server.  Thus, we can definitively say this was sent by Jordan Kaplan, no matter whether there are ten recipients or just one.  This indication alone can take us far in our attribution quest, but before continuing down that path, let’s turn to the remaining email type: incoming email from a source external to the DNC.

This message appears to be a daily mailing from the Washington Post, which is external to the DNC.  In this presentation, we have anonymized Kaplan’s email address as jk@dnc.org, which presumably is invalid.  Here again, with Kaplan as the only recipient, we might safely attribute this email to Kaplan. Let’s turn to the raw EML file for further confirmation.

Above, we see that all email coming in from a source external to the DNC was sent first to Appriver for spam filtering.  Appriver will add an “X-Primary:” field which gives us further confirmation that this email was intended for Kaplan.  One glitch, however, is that if this email had been sent to a mailing list (for example) Appriver seems to just pick one of the recipients as the “primary”.  The attribution is stronger, however, if there is only a single recipient (as in this case) and it matches this “X-Primary:” field.
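The three header patterns described above can be turned into a rough classifier. The following is a minimal Python sketch, not the tooling used for this report. The function name classify_eml is hypothetical, and the rule that any message carrying a “Received:” line but no “X-Primary:” field is local is a simplifying assumption.

```python
from email import message_from_string


def classify_eml(raw_text):
    """Classify a raw EML message as 'sent', 'external', or 'local'."""
    msg = message_from_string(raw_text)
    received = msg.get_all("Received") or []
    if not received:
        # No Received: header at all: a copy saved to the sender's Sent folder.
        return "sent"
    if msg.get("X-Primary") is not None:
        # X-Primary: is added by the spam-filtering relay to mail from outside.
        return "external"
    # Assumption: anything else was delivered locally by the Exchange server.
    return "local"
```

A message classified as 'sent' can be attributed to its sender directly; the other two classes need the additional checks discussed in the following sections.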

NOTE: In addition to spam filtering, Appriver will hold email destined for the DNC, if the DNC email server happens to be offline or is not reachable.   This will be relevant to a line of analysis discussed later in this report.

First Cut Attribution: The Sent Folder

As we mentioned earlier, emails copied to the “Sent” folder do not have a “Received:” header field; about 1 out of 8 messages meet this criterion. Since “Sent” folders are uniquely associated with individuals, we can use this indication to assign attribution.

Here, we see that our tentative groups do not overlap and their end points are close together.  Our “Sent folder” heuristic is effective.  We are fortunate that although the id’s are randomized within their groups, they are not randomized across groups.  The first seven individuals listed correspond to those named in the first WikiLeaks release.  Brinster, Crystal and Banfill are new arrivals and are unique to the second (Nov. 2016) release.

We can short-circuit the attribution for the first seven individuals disclosed in the first WikiLeaks release publication.   There, WikiLeaks provided message counts for each individual.   If we correlate those message counts with the ordering shown above, we quickly close on a solution.

This shortcut won’t help us for the second (Nov. 2016) WikiLeaks release.  There, no message counts were disclosed.  Instead, we will have to work a little harder and we will see a somewhat complicated arrangement arise.  As we have noted, the EML files have last modified times on the WikiLeaks server, which lets us arrange them in chronological order.  The two (2) second granularity of many of the last modified times produces many entries with the same last modified time; we further refine that ordering by using the ID number to break ties.  As it turns out, when we traverse this timeline we can re-construct the batches that were exported to EML files and then copied to WikiLeaks to build the final collection.
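The ordering described above, by last modified time with the ID number as a tie-breaker, can be sketched in a few lines of Python. The function name and the (email_id, last_modified) tuple layout are assumptions made for illustration.

```python
from datetime import datetime


def order_timeline(files):
    """Sort (email_id, last_modified) pairs by time, then by ID to break ties."""
    return sorted(files, key=lambda f: (f[1], f[0]))


# Hypothetical sample: two files share a timestamp because of the
# two second granularity, so the lower ID is placed first.
sample = [
    (1203, datetime(2016, 5, 25, 5, 30, 2)),
    (1187, datetime(2016, 5, 25, 5, 30, 2)),
    (1150, datetime(2016, 5, 25, 5, 30, 0)),
]
ordered_ids = [email_id for email_id, _ in order_timeline(sample)]
```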

Proceeding along the EML timeline, we use our Sent folder attribution, and extend it with these heuristics; we continue with our current attribution if any of the following are true:

  • The current email attribution is seen as a single email recipient.
  • The current email attribution is listed in the “X-Primary:” field.
  • The email’s ID falls within an ID range that we have already attributed to the current individual (we build up this range information as we go).

This method gets us pretty close; some additional manual fine tuning is needed to complete the picture.
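The walk along the timeline can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used for this report: the dictionary keys (id, kind, sender, recipients, x_primary) and the id_ranges argument are hypothetical, and messages that match none of the heuristics are left unassigned for manual review.

```python
def attribute(timeline, id_ranges=None):
    """Assign each message in a time-ordered list to a mailbox owner.

    timeline  -- list of dicts with keys: id, kind, sender, recipients, x_primary
    id_ranges -- optional {owner: (low_id, high_id)} learned so far
    """
    id_ranges = id_ranges or {}
    current = None          # owner carried forward along the timeline
    result = {}
    for m in timeline:
        if m["kind"] == "sent":
            # A Sent folder copy identifies its owner directly.
            current = m["sender"]
            result[m["id"]] = current
            continue
        owner = None
        if current is not None:
            bounds = id_ranges.get(current)
            in_range = bounds is not None and bounds[0] <= m["id"] <= bounds[1]
            if (m["recipients"] == [current]
                    or m.get("x_primary") == current
                    or in_range):
                owner = current
        result[m["id"]] = owner   # None means: needs manual review
    return result
```

Running the walk once without ranges, deriving ranges from the confident assignments, and then running it again is one way to apply the third heuristic.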

The DNC Emails were Copied in Batches

Walking the EML timeline, applying the methods described above, we observe that the DNC emails were copied/exported in various batches.

The first seven batches are contiguous in time; they were published in the first WikiLeaks release.  The second release (batches #08 through #13) shows back filling for Brinster and Banfill, though Crystal was copied in one batch.  This backfilling is consistent with an observation made by @steemwh1sks on this Twitter thread.  In that thread, wh1sks refers to a New Yorker article (Aug 2017), authored by Raffi Khatchadourian, titled Julian Assange, a Man Without a Country  [archive].

When we apply the attribution of message ID’s to particular DNC staffers, we see that each and every message ID is accounted for.
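A completeness check of this kind can be sketched as below. The function name and the (owner, low_id, high_id) tuple layout are assumptions; a staffer may appear more than once, which accommodates the backfilled batches.

```python
def check_coverage(ranges, all_ids):
    """Return (unassigned, multiply_assigned) IDs for inclusive ID ranges.

    ranges  -- list of (owner, low_id, high_id) tuples
    all_ids -- iterable of every message ID in the collection
    """
    unassigned, multiple = [], []
    for i in sorted(all_ids):
        hits = sum(1 for _, low, high in ranges if low <= i <= high)
        if hits == 0:
            unassigned.append(i)
        elif hits > 1:
            multiple.append(i)
    return unassigned, multiple
```

Every ID is accounted for exactly once when both returned lists are empty.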

Who had the Most Damaging Emails?

Using a list of 40 “most damaging” emails tabulated [archive] by The Gateway Pundit (hat tip: Reddit), we totaled up the number of times that those emails showed up for each DNC staffer in the WikiLeaks email dump.  The mailbox of Luis Miranda had the vast majority of citations.  We did not try to rank the emails based on their potential level of impact.  Miranda’s mailbox was acquired on May 23, 2016 and appeared prominently in the first WikiLeaks document dump on July 22, 2016.

Acquisition, Export, Copy, Upload

We view the process that went from individual Exchange inboxes to EML files uploaded to WikiLeaks as progressing in these stages.  We offer these stages as support for the analysis that follows; there is insufficient evidence to conclude these stages occurred exactly as shown.

  • Acquisition: The individual email accounts are accessed.
  • Export: The individual emails are saved as EML files.
  • Vetting: Prior to establishing ID ranges, some EML files could have been reviewed – with problematic emails being removed.  This process, if it occurred at all, may have been done when the EML files were exported and/or when WikiLeaks published them.  We see no evidence of vetting or curating, but cannot rule this step out, either.
  • Copying in Batches: Generally, on a per-individual basis after acquisition and export, the EML files may have been copied.  Based on our review of the May 25 timeline (and the Luis Miranda May 23 timeline), we see no evidence of a separate, distinct, copy operation (that affected the ordering of their last modification times) for those batches (#01 through #07).

Emails Were Acquired on May 23 and May 25 (2016) – 70% Were Acquired on May 23

If we look inside an EML file, we will see a “Date:” field which records the date/time the message was sent.  Its accuracy depends on the accuracy of the sender’s clock.  We have run some tests regarding the accuracy of the senders’ clocks and generally they seem quite accurate.  To do this, we compared the Sent times against the time recorded by the DNC server upon receipt.  In our view, the last sent time can be viewed as a proxy for the earliest possible acquisition date/time.  Those values are shown below.

Data Rates for the EML files dated May 23 and May 25 Support Our Conclusion that the EML Files were Acquired and Exported Simultaneously on those Dates

The May 23 and May 25 Acquisitions Were Contemporaneous With Export

Emails have an internal “Sent time”, which we derive from the value of the sender’s “Date:” field.  When we normalize those sent times to GMT and do the same for the EML file last modified times, we see groups of messages where their sent times are close to the EML last modified times.  From this, we conclude that the email acquisition operation occurred at approximately the same time as the emails were exported to EML file format.  In a following section, we provide more compelling support for this conclusion.
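Normalizing a “Date:” value to GMT and comparing it with a file’s last modified time can be sketched with the Python standard library. The function names are hypothetical, and treating an offset-less date as GMT is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def sent_time_utc(date_header):
    """Parse a Date: header value and express it in UTC (GMT)."""
    dt = parsedate_to_datetime(date_header)
    if dt.tzinfo is None:
        # Assumption: a date without a usable offset is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def gap_seconds(date_header, mtime_utc):
    """Seconds from the sent time to the file's last modified time (UTC)."""
    return (mtime_utc - sent_time_utc(date_header)).total_seconds()
```

A small positive gap for the newest messages in a mailbox is what one would expect if export followed closely on acquisition.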

The May 25 EML Files Have Pacific Timezone Indications

Some of the May 25 emails have internal Sent times that are greater than the last modified times of their corresponding EML files (when both are normalized to GMT).  This is a surprising result: EML files are derived from email messages (the email has to arrive before it can be exported to an EML file).  Therefore, we expect the last modified times of the EML files to be greater than the Sent times of their precursor emails.

We demonstrate below that, after adding back a 7 hour offset, the EML files adopt their expected relationship to their underlying emails.

Note above, after adjustment, the earliest dated EML file (08:29:38) precedes the latest dated email (08:48:34).  In the following section, we will establish a timeline and explain how this is possible.
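The consistency check behind the 7 hour adjustment can be sketched as follows. The helper name and the sample timestamps are hypothetical; only the 7 hour figure comes from the analysis above. Both timestamps in each pair are assumed to be expressed in GMT already.

```python
from datetime import datetime, timedelta

SEVEN_HOURS = timedelta(hours=7)  # the offset discussed in the text


def violations(pairs, offset=timedelta(0)):
    """Count messages whose sent time is later than the shifted file mtime.

    pairs -- list of (sent_time, eml_mtime) tuples, both in GMT
    """
    return sum(1 for sent, mtime in pairs if sent > mtime + offset)


# Hypothetical pair: the email appears to be sent after its file was written.
pairs = [(datetime(2016, 5, 25, 12, 48, 34), datetime(2016, 5, 25, 5, 50, 0))]
before = violations(pairs)
after = violations(pairs, SEVEN_HOURS)
```

If the count of impossible orderings drops to zero only when the 7 hour shift is applied, the shift is a plausible explanation for the anomaly.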

The May 25 Emails Were Likely Acquired and Exported Simultaneously (and not Vetted)

After adjusting the EML last modification times (then translating all times to EDT), we establish the following timeline.

Above, we see that email arrivals are interleaved with email export operations; yet, when an export operation starts, all of its precursor emails are available. We cannot be certain that the events we have labeled as export operations are in fact not just copy/transfer operations, but the overall pattern is compelling.

Working off our observation that acquisition and export likely occurred simultaneously, we conclude that the transfer speeds that we noted previously are in fact the speed of the export (to EML) operation.  Further, the smooth nature of those transfers suggests that this export operation (applied to several individual mailboxes) was likely driven by a script, or from a program/GUI that had the capability to select several mailboxes at once for export.  Further, the emails were likely not vetted when acquired and exported; the timeline doesn’t appear to allow for that.
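A transfer speed of this kind can be estimated from file sizes and last modified times. The sketch below is one simple way to do so and is not necessarily how the rates in this report were computed; the function name and the (size_bytes, mtime_seconds) layout are assumptions.

```python
def export_rate(files):
    """Estimate bytes per second from (size_bytes, mtime_seconds) pairs.

    The first file finished writing at the start of the measured interval,
    so only the bytes of the later files are counted against elapsed time.
    """
    ordered = sorted(files, key=lambda f: f[1])
    if len(ordered) < 2:
        return None
    elapsed = ordered[-1][1] - ordered[0][1]
    if elapsed <= 0:
        return None  # all files share one timestamp; rate is undefined
    written = sum(size for size, _ in ordered[1:])
    return written / elapsed
```

With timestamps that only resolve to two seconds, short batches give coarse estimates, so the rate is most meaningful over long runs of files.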

The May 23 EML Files Were Likely Also Acquired and Exported Simultaneously

Although over two-thirds of the DNC emails were acquired on May 23, only the Luis Miranda emails (which were acquired on May 23) made it into the first WikiLeaks release.  The rest of the May 23 emails show up in the Nov. 6 second and last release, but their last modified times were not preserved, so they can’t be analyzed further.  For the Miranda emails, we see indications that those EML files were likely also acquired and exported simultaneously.  The similar transfer rates of the Miranda emails and the May 25 acquisition operations add further confirmation.

The First WikiLeaks DNC Email Release was Likely not Vetted or Curated

Given the observation that the emails were likely acquired and exported simultaneously, combined with the uniform nature of the EML last modified time timelines, we think it is likely that those collections are unaltered and reflect their state when they were acquired.  There is room for doubt, however, because the two-second timestamp granularity limits our ability to make this determination.

The WikiLeaks DNC Emails Have ID’s that are Contiguous but are Randomized within each Group Associated with a Single Individual

We have shown above that ID’s were assigned sequentially across each group associated with a single individual.  We also ran some statistical tests that confirm the ID’s are assigned randomly within each group, yet last modified times were preserved for the emails in the first WikiLeaks release.  We have no theory on why things were done this way.  We note only that, if the original ordering of the ID’s had been preserved, we might have been able to reach stronger conclusions on whether any emails had been removed, or to make assertions about the tools used to acquire and export the emails.  We also note that if the ID’s had been randomized across the entire collection it would have been impossible to attribute groups of ID’s to specific individuals.
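The report does not name the statistical tests that were used. As one possible illustration (an assumption, not the authors’ method), a Spearman rank correlation between a message’s position in the last modified timeline and its ID should be near zero if ID’s are random within a group, and near one if ID’s simply follow export order.

```python
def rank_correlation(ids_in_time_order):
    """Spearman correlation between timeline position and ID rank.

    ids_in_time_order -- distinct IDs of one group, listed in timeline order
    """
    n = len(ids_in_time_order)
    if n < 2:
        raise ValueError("need at least two messages")
    # rank[i] is the rank of the ID found at timeline position i.
    order = sorted(range(n), key=lambda i: ids_in_time_order[i])
    rank = [0] * n
    for r, i in enumerate(order):
        rank[i] = r
    d2 = sum((rank[i] - i) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n * n - 1))
```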

Evidence of System Downtime or Lack of Connectivity in the Email Headers

We observed earlier that the DNC routed all of its incoming email (from an external source) through a provider named Appriver.  Appriver performed spam filtering services and it also improved the DNC’s email availability, because Appriver would hold incoming emails in the event that the DNC email server was unreachable.  We can see artifacts of this activity in the DNC email headers.  Based on the observation below, we confirm that the DNC email system (per Tamene, all DNC systems) was offline from 10 PM on May 23 to 4 AM on May 24.  Appriver appears to have held the emails for an additional 4 hours; they start to flow in at 8 AM.  This may have something to do with Appriver’s polling interval.
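A hold time of this kind can be approximated from the timestamps that each mail hop records in its “Received:” line. The sketch below takes the spread between the earliest and latest such timestamps in a message; using that spread as a stand-in for the relay’s hold time is an assumption, as are the function names.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def received_times(raw_text):
    """Return the timestamp recorded in each Received: header."""
    msg = message_from_string(raw_text)
    times = []
    for header in msg.get_all("Received") or []:
        # The timestamp follows the last ';' in a Received: header.
        stamp = header.rsplit(";", 1)[-1].strip()
        times.append(parsedate_to_datetime(stamp))
    return times


def transit_delay(raw_text):
    """Spread between the earliest and latest hop, or None if under two hops."""
    times = received_times(raw_text)
    if len(times) < 2:
        return None
    return max(times) - min(times)
```

Messages whose delay is measured in hours rather than seconds, clustered in one time window, are the artifact one would expect if the receiving server was unreachable during that window.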

Separately, in an email dated May 3, 2016 the system administrator (Yared Tamene) mentions that the DNC systems will be unavailable starting 10 PM May 4 through early AM on May 5.   Using our Appriver hold time calculation, we don’t see this advertised downtime; perhaps the email server (and its connectivity) was unaffected (or it was rebooted quickly).  Readers may recall that Crowdstrike was paid on May 5 by the DNC for services rendered (approximately $9,000 per FEC filings).  Was the May 5 planned downtime reserved for Crowdstrike’s initial investigation and/or intervention?

The DNC Email Acquisition Timeline

The May 24 system down time stands directly between the May 23 acquisition of DNC emails and the May 25 acquisition.  The down time was hurriedly announced in the early AM of May 23, just hours after the May 23 dated files had been acquired.  Tamene announced that all systems will be offline beginning at 10 PM.  The following table summarizes the timeline.

Anomaly: the EML Files Exported on May 25 Exceed the 30 Day Retention Period

Various researchers have cited this DNC email where it states that DNC emails are retained for only 30 days, unless those emails are moved into a mail folder.  An unasked question is: When was this 30 day retention rule implemented?

When we sort the emails by their earliest Sent times – we see below that there are many emails that exceed the 30 day retention period.  Interestingly, only the emails ex-filtrated on May 25 exhibit this behavior.  We have no theory to explain this observation.  It stands at odds with the email above that says emails are retained for only 30 days.  We note that less than 3% of the total emails exceed the 30 day period (for the emails exported on May 25).
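The retention check can be sketched as a simple cutoff comparison. The function name and the sample dates are hypothetical; the 30 day window comes from the DNC email cited above.

```python
from datetime import datetime, timedelta


def beyond_retention(sent_times, export_time, days=30):
    """Return the sent times that are older than the retention window."""
    cutoff = export_time - timedelta(days=days)
    return [t for t in sent_times if t < cutoff]


# Hypothetical sample for a mailbox exported on May 25, 2016.
export_time = datetime(2016, 5, 25, 9, 0, 0)
sample = [datetime(2016, 4, 20), datetime(2016, 5, 1), datetime(2016, 5, 24)]
old = beyond_retention(sample, export_time)
```

Dividing the length of the returned list by the mailbox size gives the fraction of messages that exceed the retention period.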

For Jeremy Brinster, we cannot exclude the possibility that his email may have been acquired in two (or more) batches with the earliest acquisition being circa 2016-05-18.  Brinster’s EML files have a last modified date of August 26, which is clearly not the ex-filtration date; this limits our ability to determine whether the EML files were exported in two or more batches.

The “1Gb or so” Archive

The GRU indictment states: On or about July 18, 2016, Organization 1 confirmed it had “the 1Gb or so archive” and would make a release of the stolen documents “this week.” We are not told specifically about the exact contents of this archive, or even if this archive has the “stolen documents” that will be released “this week”.  Certainly, the indictment leads us to believe that this archive held the DNC emails that will be released just four (4) days later (July 22, 2016), but it never directly asserts this.

We can put this “1Gb or so” characterization to the test.  Below, we zipped (1) the July 22 release (rel-1), (2) the November 6 release (rel-2), and (3) the full collection (all).  We tried both 7zip and regular Zip formats and displayed the size results using both a base 10 Gigabyte value (GB) and a base 2 value (GiB).  The indictment doesn’t mention the second release, or the entire collection; both are shown for completeness.
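The two size units differ by about 7%, which matters when judging what “1Gb or so” can hold. A minimal conversion sketch follows; the function names are hypothetical.

```python
def to_gb(size_bytes):
    """Size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return size_bytes / 10**9


def to_gib(size_bytes):
    """Size in binary gibibytes (1 GiB = 2**30 bytes)."""
    return size_bytes / 2**30
```

For example, the 2.19 GB figure quoted for the full collection in this section works out to roughly 2.04 GiB, so the choice of unit does not bring it close to 1.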

Here, we see that when the first release is packed into a conventional Zip file it fits into a “1Gb or so archive”.  Thus, we can’t exclude (out of hand) the possibility that the archive file mentioned in the indictment might have been a Zip file containing the first DNC email release.

This seems to fit well with the indictment’s narrative, but not when we recall the New Yorker article, mentioned earlier.  The article states: Meanwhile, a WikiLeaks team was scrambling to prepare the D.N.C. material. (A WikiLeaks staffer told me that they worked so fast that they lost track of some of the e-mails, which they quietly released later in the year.)   Based on our analysis, those additional emails were uploaded to the WikiLeaks site on August 26 and September 21; yet, a few months would pass before an announcement was made on November 6, 2016 that a second (and final) release was available.  This second batch of files was roughly the same size as the first batch (20,000 emails).

The New Yorker article suggests that all of the emails were available when WikiLeaks made its first release on July 22, 2016.  The indictment does not mention any other Zip files than the “1Gb or so” archive.  Looking at the “all.zip” entry in the table above, we see that the Zip file is 2.19 GB in size.  That is definitely not “1Gb or so”.

Our conclusion is that the “1Gb or so” archive file referred to in the indictment cannot be the archive file that WikiLeaks drew from for the DNC emails, because it is just too small to hold the entire collection.

Anomaly: Problems Resolving Banfill’s Email Address on May 2, 2016

There is a group of eight (8) emails with Sent dates of May 2, 2016 that indicate there were problems resolving Banfill’s email address.

The problem emails are detailed below.  As shown, they appear in the emails collected for Miranda with sent times in the range 11:42:56 through 14:43:53.  We provide this list for other researchers in the event that they may decide to follow up.

Per various sources on the Internet, the “IMCEAEX” styled addresses are used when a user’s mailbox is improperly migrated, but this can also be an artifact of improperly moving exported emails from an original Exchange server to another, when (for example) they are copied to another server for production of emails in an e-discovery context. Here are some references: (1) IMCEAEX non-delivery report when you send email messages to an internal user in Office 365, (2) Strange Exchange E-mail Addresses in e-Discovery, and (3) IMCEAEX NDRs and how to fix them…

Was the 30 Day Email Retention Policy Implemented in Early May 2016?

As we noted earlier, Banfill was reminded of the 30 day retention period on May 17, 2016. Further, we pointed out a May 3, 2016 email, which announced unplanned system downtime on May 5. We do not know if those observations are related. Taken together with a possible failed delivery on May 2, they lead us to wonder whether the 30 day retention period was implemented early in May 2016.

Guccifer 2’s DNC Email Attachments

Many of Guccifer 2’s DNC Email Documents Pre-Date the GRU Indictment Timeline

Guccifer 2 posted seventeen (17) documents on June 30, 2016 and July 6, 2016 that can be found as attachments to DNC emails published by WikiLeaks on July 22, 2016 and November 6, 2016.

Previously, we noted that the WikiLeaks DNC email collection appeared to have been ex-filtrated on two dates: May 23 and May 25, 2016 (technically, the May 23 collection was initiated in the late evening of the previous day).  Below, we list the DNC emails that have attachments which match the documents that Guccifer 2 published.  They are sorted first by ex-filtration date and then by name.  The “G2 Tweak” column has an “x” for documents that Guccifer 2 modified, often making trivial changes like saving the document using a quirky user name. Guccifer 2 described those tweaks as his “watermark”.

We caution the reader that the May 23 and May 25 ex-filtration dates above apply to the WikiLeaks collection of DNC emails.  We cannot say whether Guccifer 2 got his documents from the same source, or not.  In fact, it could be sheer coincidence that the DNC emails shown above contain attachments with documents that Guccifer 2 published before the first release of DNC emails (July 22, 2016).

Above, we can see that almost one half of the emails listed were ex-filtrated on May 23. This is relevant, because the GRU indictment (dated July 2018) said:

29. Between on or about May 25, 2016 and June 1, 2016, the Conspirators hacked the DNC Microsoft Exchange Server and stole thousands of emails from the work accounts of DNC employees.

Over two-thirds of the DNC emails in the WikiLeaks collection were ex-filtrated on May 23.  This is at odds with the indictment.  We see above that Guccifer 2 published several documents that appear as attachments to emails that we conclude were acquired on May 23.
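The mismatch can be stated as a simple date-range check. This is a sketch only: the window comes from paragraph 29 of the indictment as quoted above, the two acquisition dates are the ones we derived for the WikiLeaks collection, and the variable and function names are ours.

```python
from datetime import date

# Window stated in paragraph 29 of the indictment ("on or about").
WINDOW_START = date(2016, 5, 25)
WINDOW_END = date(2016, 6, 1)

# Acquisition dates we derived for the WikiLeaks DNC email collection.
ACQUISITION_DATES = [date(2016, 5, 23), date(2016, 5, 25)]


def in_window(day):
    """Return True if day falls inside the indictment's stated window."""
    return WINDOW_START <= day <= WINDOW_END


# Acquisition dates that the indictment's window does not cover.
outside = [day for day in ACQUISITION_DATES if not in_window(day)]

for day in ACQUISITION_DATES:
    status = "inside" if in_window(day) else "outside"
    print(day.isoformat(), status)
```

Only the May 25 acquisition falls inside the stated window; the May 23 acquisition precedes it by two days.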

Below, we sort the emails by Sent date.

The first line (in light orange) shows an email that has a Sent date of April 20, 2016.  As we described above, the DNC apparently implemented a 30 day retention period – emails older than 30 days were deleted.  Generally, all email ex-filtrated on May 23 complied with this retention rule, except for Brinster’s emails.  Brinster’s earliest Sent date was April 18, 2016.  If we add 30 days to that, we have May 18, which is 5 days earlier than the May 23 acquisition date.  Given this observation, we cannot rule out the possibility that there might have also been an earlier acquisition date than May 23 for the Brinster emails (in addition to a May 23 acquisition).
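The retention arithmetic above can be checked with a few lines of date math. This sketch assumes that an email is deleted once 30 days have passed since its Sent date; that reading of the retention policy is our assumption, and the function name is hypothetical.

```python
from datetime import date, timedelta

# Assumed retention rule: an email expires 30 days after its Sent date.
RETENTION = timedelta(days=30)

# Acquisition date derived for the first portion of the collection.
ACQUIRED = date(2016, 5, 23)


def days_past_retention(sent, acquired=ACQUIRED):
    """Days between the email's expiry date and the acquisition date.

    A positive result means the email should already have been deleted
    by the time of acquisition under the assumed rule.
    """
    expiry = sent + RETENTION
    return (acquired - expiry).days


# Brinster's earliest Sent date, and the April 20 email from the table.
brinster_gap = days_past_retention(date(2016, 4, 18))
april_20_gap = days_past_retention(date(2016, 4, 20))

print("Brinster earliest email:", brinster_gap, "days past retention")
print("April 20 email:", april_20_gap, "days past retention")
```

Under this assumption, Brinster’s earliest email would have expired on May 18, five days before the May 23 acquisition.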

The line in light green is the last dated email found in the WikiLeaks collection that was acquired on May 25 (that matches a document posted by Guccifer 2).  Notice that the Sent Date is May 22, which precedes the May 25 and May 23 acquisition dates that apply to the WikiLeaks collection.  As we have said before, we have no evidence that Guccifer 2 derived his DNC documents from the email attachments found in WikiLeaks.  All we can say is that if Guccifer 2 derived his documents from a different source, then Guccifer 2 must have acquired those emails no earlier than May 23.

This table also shows that two of the documents published by Guccifer 2 can be found as attachments to emails published by WikiLeaks in its second email release (on Nov 6). In our discussion of the “1Gb or so” archive file that the indictment suggests had been transferred from Guccifer 2 to WikiLeaks, we observed that more than twice that amount would be needed to hold both DNC email releases published by WikiLeaks. The indictment does not mention any other archive or transfer of documents from Guccifer 2 to WikiLeaks. Therefore, we doubt that the “1Gb or so” archive was a precursor to the WikiLeaks DNC email publication.

The Mueller Report Demurs on the DNC Email Hack

The Mueller report speculates that there is a link between the DNC emails that it claims were stolen by the indicted GRU agents and the emails that WikiLeaks published. Yet it offers no certainty (quite the opposite). The relevant text, shown below, begins at the bottom of page 40 (emphasis added).

Between approximately May 25, 2016 and June 1, 2016, GRU officers accessed the DNC’s mail server from a GRU-controlled computer leased inside the United States.  During these connections, [the GRU] officers appear to have stolen thousands of emails and attachments, which were later released by WikiLeaks in July 2016.

Apparently, the Special Counsel’s investigators have no proof that the indicted GRU officers actually stole the DNC emails, or that those same emails were the source of the DNC emails published by WikiLeaks. Further, no mention is made of the second (Nov 6) WikiLeaks release of DNC emails (which was roughly equal in number to the first). When we add our observation that over two-thirds of the DNC emails were acquired on May 23 (not May 25 through June 1, as stated above), we conclude that the Special Counsel’s allegations lack merit.

Disclaimer

Indictments seldom disclose all the relevant facts and often withhold information ahead of trial.  Therefore, our analysis is constrained to respond to only the claims made in the GRU indictment and the Mueller report.

Closing Thoughts