The Need for Speed

Some reviewers have questioned the following conclusion in the Guccifer 2.0 NGP/VAN Metadata Analysis study.

Conclusion 7. A transfer rate of 23 MB/s is estimated for this initial file collection operation.  This transfer rate can be achieved when files are copied over a LAN, but this rate is too fast to support the hypothesis that the DNC data was initially copied over the Internet (esp. to Romania).

The performance data tabulated below demonstrate that transfer rates of 23 MB/s (Mega Bytes per second) are not just highly unlikely, but effectively impossible to achieve when communicating over the Internet at any significant distance.  Further, measured local copy speeds show that 23 MB/s is a typical transfer rate when writing to a USB-2 flash device (thumb drive).

Below are some representative discussions of the 23 MB/s rate cited in the study.

Guc2-reddit-23MBs-1

Guc2-reddit-23MBs-2

Guc2-reddit-23MBs-3

As we can see above, there was some confusion regarding the MB/s notation used in the analysis.  The analysis uses MB/s as a short form of “Mega Bytes per second”, as detailed in MB: Mega Bytes or Mega Bits?  There is also a mistaken belief that very fast local Internet transfer speeds in Romania will somehow make up for the much slower rates seen when traffic crosses Europe and then the Atlantic to reach Washington, DC.  To further complicate matters, various independent experts have asserted that Guccifer 2 used a Russian-based VPN service (through an end point in France) to communicate with various people.

In practice, actual transmission rates will fall well below the theoretical rates, because packets transmitted over the Internet have to transit many switches and must share bandwidth with other users.  Further, copying many small files increases the number of “hand-shaking” messages, which further decreases the effective transmission speed.  The only way to find the speeds that can actually be achieved is to run tests.  The typical ISP-provided “speed test” will show optimistic speeds, but it is a start.  The following graphic shows the result of a cable provider’s speed test.

cable-speed-test-20-miles

In that test, we accessed one of the provider’s hosts that is about 20 miles away (as the crow flies).  The 113.4 Mbits/s rate corresponds to a 14.2 MB/s rate – well below 23 MB/s.
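
The conversion from the speed test's megabits per second to the study's megabytes per second is a division by 8. A minimal sketch of that arithmetic follows; the function name is ours, chosen for illustration.

```python
def mbits_to_mbytes(mbits_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbits_per_s / 8.0


# 113.4 Mbits/s is the download rate reported by the cable provider's speed test.
speed_test_mb_per_s = mbits_to_mbytes(113.4)
print(f"speed test: {speed_test_mb_per_s:.1f} MB/s")
print(f"the study's 23 MB/s is {23.0 / speed_test_mb_per_s:.1f}x that rate")
```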

Here is another test, accessing a host that is on the opposite coast (3100 miles away).

cable-speed-test-3100-miles

We can see that increases in the distance traveled can have a major impact on the transmission speed.  In this test, accessing a host on the opposite coast cut the download speed by a factor of 7.
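
One commonly cited reason why distance matters, which the tests here do not isolate, is that a single TCP stream can have at most one window of data in flight per round trip. The sketch below computes that upper bound. The 64 KiB window and the round-trip times are illustrative assumptions, not measurements from these tests, and modern systems can negotiate much larger windows.

```python
def max_single_stream_mb_per_s(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one TCP stream: one window of data per round trip."""
    rtt_s = rtt_ms / 1000.0
    return window_bytes / rtt_s / 1_000_000


WINDOW_BYTES = 64 * 1024  # assumed window size (classic limit without window scaling)

# Hypothetical round-trip times for a nearby, a cross-country, and an overseas host.
for rtt_ms in (10.0, 70.0, 150.0):
    bound = max_single_stream_mb_per_s(WINDOW_BYTES, rtt_ms)
    print(f"RTT {rtt_ms:5.0f} ms -> at most {bound:.2f} MB/s per stream")
```

The bound falls in proportion to the round-trip time, which is the general shape of the distance effect; the actual rates depend on window sizes, congestion, and the number of parallel streams.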

ThreatConnect, a security firm, determined that Guccifer 2 used a commercial VPN service to mask his IP address.  ThreatConnect’s analysis is described in a blog entry.  Their key finding is summarized below (emphasis added).

Now, after further investigation, we can confirm that Guccifer 2.0 is using the Russia-based Elite VPN service to communicate and leak documents directly with the media. We reached this conclusion by analyzing the infrastructure associated with an email exchange with Guccifer 2.0 shared with ThreatConnect by Vocativ’s Senior Privacy and Security reporter Kevin Collier. This discovery strengthens our ongoing assessment that Guccifer 2.0 is a Russian propaganda effort and not an independent actor.

In March 2017, Adam Carter followed up on ThreatConnect’s research [see http://g-2.space/ in the section titled “UPDATE (12 March)”].  Adam disputes their claim that the VPN IP address used was somehow “dedicated” for use by Guccifer 2 and perhaps other hackers with connections to Russia.  Adam writes:

So… it turns out that if ThreatConnect had tried using the default options – they would have been allocated the “exclusive” IP address that was NEVER really exclusive.

They’ve caused concern and distress unduly for a VPN Service provider by misrepresenting the service and produced false-positive indicators by suggesting the IP address was used by a shady group of Russians/Guccifer2.0 with exclusivity.

The discussion above is provided as background, simply to establish that any experiment that intends to replicate Guccifer 2’s use of the Internet should use a VPN service and measure speeds over that VPN connection.

If we enable a VPN service targeting a nearby server and retry the speed test, we see the following.

cable-speed-test-vpn-400-miles

The download speed over the VPN is roughly 60% of the speed of a direct connection.  There are probably a few reasons for this drop in speed: (1) the test no longer goes only through the provider’s network, (2) transiting the VPN server introduces another hop, (3) the VPN provider may implement bandwidth throttling, and (4) there may be additional overhead introduced by the VPN client, which is implemented in software.

The test results shown above are summarized in this table.

Guc2-Internet-speedtest

Even without introducing an intermediate VPN server or crossing the Atlantic, we can see that the 2 MB/s transfer rate achieved when going cross country (US) is slower than the 23 MB/s calculated in the report by a factor of about 10.

To compare local transfer speeds with the speeds achieved when copying from a nearby host on the Internet, we ran tests copying the NGP VAN files on a file-by-file basis and then as a single big file (built by concatenating all the individual files).

Guc2-source-v-target-xfers

A few observations:

  • Local file-by-file copies over the (1 Gbit/s) LAN to a USB-2 thumb drive saw a transfer rate of 24.3 MB/s, which is quite close to the 22.6 MB/s rate calculated in the study.
  • The file-by-file copy rate over the LAN to the PC’s hard drive (SSD) was 47.8 MB/s – about 2x the speed seen when copying to a USB-2 thumb drive.
  • File-by-file copies from a well-provisioned server (with a 1 Gbit/s symmetric Internet service) about 25 miles away clocked in at about 8.5 MB/s.  That rate was dominated by communication overhead and did not vary much across target media types.
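
For readers who want to run a comparable test, the two copy styles can be timed with a short script. This is only a sketch of one way to do it, not the procedure used to produce the table above: the function names are ours, 1 MB is taken as 1,000,000 bytes, and operating system caching can inflate the results unless the source is read cold.

```python
import shutil
import time
from pathlib import Path


def copy_rate_mb_per_s(total_bytes: int, seconds: float) -> float:
    """Transfer rate in MB/s, assuming 1 MB = 1,000,000 bytes."""
    return total_bytes / seconds / 1_000_000


def time_file_by_file(src_dir: Path, dst_dir: Path) -> tuple[int, float]:
    """Copy each regular file under src_dir individually; return (bytes, seconds)."""
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    total = 0
    start = time.perf_counter()
    for f in files:
        target = dst_dir / f.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(f, target)
        total += f.stat().st_size
    return total, time.perf_counter() - start


def time_single_file(src_dir: Path, big_file: Path, dst_file: Path) -> tuple[int, float]:
    """Concatenate all files into big_file (untimed), then time one copy of it.

    big_file and dst_file must live outside src_dir.
    """
    with big_file.open("wb") as out:
        for f in sorted(p for p in src_dir.rglob("*") if p.is_file()):
            with f.open("rb") as src:
                shutil.copyfileobj(src, out)
    start = time.perf_counter()
    shutil.copyfile(big_file, dst_file)
    return big_file.stat().st_size, time.perf_counter() - start


# Example use (paths are hypothetical; point dst at the USB drive or SSD under test):
#   n, secs = time_file_by_file(Path("ngpvan_files"), Path("/media/usb/copy"))
#   print(f"{copy_rate_mb_per_s(n, secs):.1f} MB/s")
```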

Large single file copies were significantly faster than file-by-file copies as shown in the table below.

Guc2-file-by-file-vs-big-file

A few observations:

  • The effect of copying file-by-file versus copying an equivalent single large file is especially noticeable over the Internet, where single large file copies were consistently faster (about 1.7x) than file-by-file copies.
  • Even though single file copies were significantly faster than file-by-file copies, they showed a similar variation in speed for matched source and target media types.
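
A simple way to picture why file-by-file copies lag behind a single large file is to assume that every file adds a fixed amount of hand-shaking time on top of the time spent moving bytes. The sketch below encodes that assumption. The model and all of its inputs are illustrative; none of the numbers are measurements from these tests.

```python
def effective_rate_mb_per_s(total_mb: float, n_files: int,
                            link_mb_per_s: float,
                            per_file_overhead_s: float) -> float:
    """Effective rate when each file adds a fixed overhead (a simplifying assumption)."""
    payload_s = total_mb / link_mb_per_s          # time spent moving bytes
    overhead_s = n_files * per_file_overhead_s    # time spent on per-file hand-shaking
    return total_mb / (payload_s + overhead_s)


# Illustrative inputs only.
TOTAL_MB = 1000.0
LINK_MB_PER_S = 14.0
PER_FILE_OVERHEAD_S = 0.02

for n_files in (1, 500, 2000):
    rate = effective_rate_mb_per_s(TOTAL_MB, n_files, LINK_MB_PER_S, PER_FILE_OVERHEAD_S)
    print(f"{n_files:5d} files -> {rate:.1f} MB/s")
```

Under this model the penalty grows with the number of files and with the per-file delay, so it is most visible on links with long round trips.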

Finally, a few Internet-only tests were run: a 100 MB file was copied first directly from the Internet server used in the previous tests, and then through VPN servers at various geographic locations.

The following primitive diagram shows the Internet connectivity when a VPN server is interposed.

Guc2-vpn-server-diag

As the diagram shows, the VPN server relays the communications between the PC and the Server.  By selecting VPN servers at various distances from the PC, we can simulate communication with a server at various geographic locations.

The test results are shown in the table below.

Guc2-Internet-xfers

When copying a single large file, we were able to achieve a transfer speed just a little faster than the vendor’s speed test, which indicates that the test server’s Internet connection is fast enough to max out the local cable connection.  As we saw before, the speed drops noticeably when we enable the VPN service, even when the VPN server is close to the PC and the test server.  The transfer speeds drop into the range of 1 MB/s to 2 MB/s when communicating through Romanian, Ukrainian, or Russian VPN servers.

In conclusion, the performance data above strongly supports the statement in the study:

A transfer rate of 23 MB/s is estimated for this initial file collection operation.  This transfer rate can be achieved when files are copied over a LAN, but this rate is too fast to support the hypothesis that the DNC data was initially copied over the Internet (esp. to Romania). 

36 thoughts on “The Need for Speed”

  1. Comments are closed. They have been open for over a month; hopefully this has given ample opportunity for readers to comment. Responding to comments is worthwhile, but time-consuming; The Forensicator needs to turn his attention to other projects. Thank you everyone who has taken the time to comment.
    — The Forensicator

  2. New blog post: Summarizes the Internet speed issue, adds new transfer speed calculations that raise the bar for transfer speed over the Internet, discusses alternative theories, and corrects the record.

    If you find yourself in a hole, stop digging

    […] The Forensicator made a mistake, maybe a couple. In this blog post he will describe those mistakes and how he plans to fix them.

    The main mistake he made is that he got sucked into defending a technical claim made as a side remark, which had little impact on the Guccifer 2.0 NGP/VAN Metadata Analysis […]

  3. The VIPS memo states: “After examining metadata from the “Guccifer 2.0” July 5, 2016 intrusion into the DNC server” and also that the July 5 copy was on “a computer directly connected to the DNC server or DNC Local Area Network”.

    As far as I can tell, neither you nor Adam Carter claim this and you are both open to the possibility that the July 5 copy was made from source files that weren’t on the DNC server at that time. Is that right, and if so, can you tell VIPS?

    1. FYI, the full VIPS memo is here.

      Although The Forensicator fully supports the VIPS’s request for a thorough investigation of Russian hacking claims (with more evidence being made available to the public), the VIPS may have gotten a bit in front of the ball with their claims. As far as contacting them goes, their article has been out for a while and has received a lot of attention. Forensicator’s guess is that they have gotten your message, as well as others.

      Another claim in the VIPS memo that has received a lot of heat is the claim that the observed 23 MB/s transfer speed is “too fast for the Internet” (Forensicator’s paraphrase, not actually stated in those words). Forensicator has always viewed that as a minor point in the analysis, and it would not have been his choice as the main claim used in any article/report derived from his work. He can see why the VIPS chose it – it is simple to state and understand. Unfortunately, even though the claim hasn’t been shot down yet with concrete test data, it is a difficult point to defend.

      Adam Carter has published an article which addresses various issues in recent media reports on both the Forensicator’s analysis and his own research on Guccifer 2.
      See Distortions & Missing The Point

      The Forensicator is working on a blog post that will be published here soon that will address various subjects including the VIPS (and media) focus on transfer speeds over the Internet as well as some follow up to feedback that has come in over the past month.

      As a reminder, the main point of The Forensicator’s metadata analysis was to challenge the Guccifer 2 narrative (remote hacker in Romania). The findings however can be interpreted in several ways with varying degrees of certainty — various journalists/security experts/organizations have done that.

      1. Thank you. I look forward to the new post. I’ve read Adam Carter’s latest.

        VIPS was more than just a bit ahead of the ball on direct access to the DNC server. They said you claimed something you didn’t on a very important point.

        I understand the cat is out of the bag now, both Patrick Lawrence and Leonid B. repeated the claim, so I agree contacting them isn’t going to put it back. But perhaps you could write something new and/or update the analysis page to clear up the misconception.

  4. “A hacker might have downloaded it to one computer, then shared it by USB to an air gapped [off the internet] network for translation, then copied by a different person for analysis, then brought a new USB to an entirely different air gapped computer to determine a strategy all before it was packaged for Guccifer 2.0 to leak,” said Barger.

    Every time the files were copied, depending on the method they were transmitted, there would be a new chance for the metadata to be changed.

    Hultquist said the date that Forensicator believes that the files were downloaded, based on the metadata, is almost definitely not the date the files were removed from the DNC.

    That date, July 5, 2016, was far later than the April dates when the DNC hackers registered “electionleaks.com” and “DCLeaks.com.” Hultquist noted that the DNC hackers likely had stolen files by the time they began determining their strategy to post them.

    http://thehill.com/policy/cybersecurity/346468-why-the-latest-theory-about-the-dnc-not-being-a-hack-is-probably-wrong

    1. The Hill article that you cite has various errors and misunderstandings as they relate to Guccifer 2.0 NGP/VAN Metadata Analysis.

      Adam Carter has published an article which addresses various issues with recent media reports on both the Forensicator’s analysis and his own research on Guccifer 2.
      See Distortions & Missing The Point

      On this point, Mr. Hultquist and The Hill are just wrong:

      Hultquist said the date that Forensicator believes that the files were downloaded, based on the metadata, is almost definitely not the date the files were removed from the DNC.

      If Mr. Hultquist were to review the Forensicator’s metadata analysis in detail along with the many replies to comments and additional blog posts, he would find no statement along the lines of the one above. Mr. Hultquist would also note that the Forensicator does not use the term download because, as his analysis describes in detail, he sees indications of copying the data on two dates: July 5, 2016 and Sept. 1, 2016. Further, both copying events have indications that the data was copied locally (and that Eastern time zone settings were in effect).

      In spite of making up a statement/belief that The Forensicator allegedly expressed, The Hill did not demonstrate customary journalistic practice and link to either The Forensicator’s analysis, or the work of Adam Carter. If they had done that, their readers could have more easily researched the topic on their own.

  5. Another argument for that speed being impossible is that we are talking about a server here. The entire point of a server is to, well, serve, multiple people. It’s pretty safe to assume that there were more people connected to DNC servers, than just Guccifer 2.0. That means that the server would need to upload files at the rate of 22MB/s, while at the same time handling other connections/download from actual DNC staff members. Which means the actual bandwidth of the server, would have to be way above that.

    1. Do you have any clarity into the server platform and network topology at the DNC? Guccifer 2.0 reputedly claimed that he hacked the DNC through their NGP/VAN account. As best as I can determine, that is a cloud service hosted by AWS that provides a suite of applications for managing GOTV, fundraising, a public website and other functions of politicking. Issues like server load would certainly be relevant to that server. It holds the Democrats’ national voter database, information that would be catnip to a troublemaker bent on disruption. Does AWS or NGP/VAN acknowledge an intrusion during the Spring/Summer of 2016?

      The directory listing provided in your analysis reflects the organization of a typical LAN file server, with specific reports, notes, etc., some named in capricious ways that don’t suggest the discipline of a formal enterprise software application used by many people. It has been stated that some of the contents were Word documents and that some were in RTF format. Would a contemporary internet-hosted software suite regarded as the party’s primary competitive IT asset need to acknowledge legacy file encoding?

      The DNC server therefore seems to be utilized as a common workgroup file repository that is accessible to users with LAN connections inside the DNC’s offices, a simple “shared disk” where the security protocol was to password-protect individual files, rather than define server-enforced security policy. If this server were set up with public-facing services like HTTP or FTP, that should be evident.

      My questions are speculative, but the silence regarding whatever was actually hacked/leaked suggests how little we really know without a hands-on investigation of the network and devices involved.

      1. My questions are speculative, but the silence regarding whatever was actually hacked/leaked suggests how little we really know without a hands-on investigation of the network and devices involved.

        This, I think is key and is my main goal in publishing the Guccifer 2.0 NGP/VAN Metadata Analysis. The VIPS have also posted a report which they use as a basis to request that a formal, in depth investigation be done where more convincing evidence is shared with the public.

        Partly because it is easy to do (and can generate some clicks), various reviewers and journos have picked over both the metadata analysis and the follow-on media articles. Some criticism is constructive; however, they (IMO) may have lost sight of the bigger picture, which is: Do the NGP VAN metadata analysis and Adam Carter’s work at g-2.space provide some incentive to update the analysis published in the USIC report and augment it with some hard data?

    1. Thanks. Can you post a speed test for a local server and mention the nature of your connection (optical fiber, presumably) and how long you have had it? Can you also traceroute that DC speed test server and/or give its IP address? Do you have any anecdotal data on transferring large files from a confirmed US site? By “confirmed”, I mean a site that you can reliably determine isn’t cached locally?

      FYI, a journalist (for The Hill, IIRC) contacted the DNC as part of his reporting and they declined to make any statement regarding the speed of the DNC’s Internet connection. That is also a factor in determining max transfer speed for the case being considered.

      In your speed test the max upload speed is 55 Mbits/sec. If the DNC server were similarly constrained – 6.9 MB/s would be the best case.

      1. “In your speed test the max upload speed is 55 Mbits/sec. If the DNC server were similarly constrained – 6.9 MB/s would be the best case.”

        The download speed is what is important in this case, since it is the server in DC that is sending him the file, just as it was the server in DC sending the DNC files. David’s download speed is limited by the upload speed of the DC server, which is almost 3x the “impossible” speed of 23 MB/s.

        1. The download speed is what is important in this case, since it is the server in DC that is sending him the file,

          For David to download data, the DC server has to upload it. To flip things around, if the DC server had David’s service, its max upload speed would be 7 MB/s and that would be the rate that David sees, no matter how fast he can download the data.
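
To make the bottleneck point concrete, here is a minimal sketch that treats the end-to-end rate as the smaller of the sender's upload rate and the receiver's download rate. The function name is ours, and the 1000 Mbits/s download figure is a hypothetical stand-in for a fast receiver; only the 55 Mbits/s upload figure comes from this thread.

```python
def end_to_end_mb_per_s(sender_upload_mbits: float,
                        receiver_download_mbits: float) -> float:
    """Best-case rate in MB/s: the slower side sets the pace (8 bits per byte)."""
    bottleneck_mbits = min(sender_upload_mbits, receiver_download_mbits)
    return bottleneck_mbits / 8.0


# A sender limited to 55 Mbits/s upload caps the transfer, however fast the receiver is.
print(f"{end_to_end_mb_per_s(55, 1000):.1f} MB/s")
```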

          A technical note on Internet speed tests: they download multiple streams in parallel. Basically, the goal is to fill the pipe. Details here:
          How does the test itself work? How is the result calculated?
          The metadata from the NGP VAN files that were analyzed showed no signs of a multi-threaded download; it is therefore reasonable to expect actual file transfer speeds to be lower than the Speedtest results. “The Need for Speed” article used Speedtest results in places to (1) establish a best-case baseline and (2) show the potential impact of communicating over a distance and of using a VPN service. Otherwise, the tests copied the actual files.

          The point about upload speed was mentioned in the paragraph before the one you cited.

          FYI, a journalist (for The Hill, IIRC) contacted the DNC as part of his reporting and they declined to make any statement regarding the speed of the DNC’s Internet connection. That is also a factor in determining max transfer speed for the case being considered.

          If we are to make a hypothetical case for copy speeds from a DNC server, we need to know how the DNC’s Internet service was configured. The DNC has declined a request to provide that info.

          Although the media has focused on the claim in Guccifer 2.0 NGP/VAN Metadata Analysis about Internet speeds, that statement is not critical to the overall analysis. I plan to write up a blog post over the weekend to address some issues in that regard.

          In the meantime, refer to this article authored by Adam Carter which addresses various misconceptions and outright errors in recent media coverage of the Forensicator’s analysis.
          Distortions & Missing The Point

    2. Yes.. this report is crazily uninformed or maliciously devious. Been uploading and downloading at much higher rates than the impossible ones reported here for many years between Europe and the US. 5 years ago was able to transfer at around 35 MB/s from European university to US university. At work in last 3 years we have commercial grade (professional) connection capable of much more. Distance argument is hilarious.

      1. Although anecdotal reports are good, I would like to update The Need for Speed study with hard data. Do you have the capability of copying, say, a 100 MB file (ideally with random data in it to avoid compression) from somewhere in the US to your company’s location (presumably in Europe)? Or do you still have contacts in the universities that you mention who could run that test?
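
For anyone willing to run such a test, the sketch below shows one way to build a file of random bytes and time a copy of it. The function names and the example paths are hypothetical, and it assumes the remote location is reachable as an ordinary file path (for example, a mounted network share); a transfer tool such as scp would need to be timed separately.

```python
import os
import shutil
import time
from pathlib import Path


def make_random_file(path: Path, size_bytes: int, chunk: int = 1 << 20) -> None:
    """Write size_bytes of random data, which compression cannot shrink much."""
    remaining = size_bytes
    with path.open("wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n


def timed_copy_mb_per_s(src: Path, dst: Path) -> float:
    """Copy src to dst and return the rate in MB/s (1 MB = 1,000,000 bytes)."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return src.stat().st_size / elapsed / 1_000_000


# Example use (hypothetical paths):
#   make_random_file(Path("random_100mb.bin"), 100 * 1_000_000)
#   print(timed_copy_mb_per_s(Path("random_100mb.bin"), Path("/mnt/remote/random_100mb.bin")))
```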

  6. theforensicator, I’m trying to understand what precludes the possibility that the files were transferred via SFTP from the original host computer to a remote server across the world. If the SFTP client were configured to preserve timestamps, then almost everything you concluded in your analysis could’ve been accomplished at the remote destination server at some later time. In other words, SFTP the entire batch of files over the internet while preserving timestamps, then at some later time on the destination computer, filter the files and copy them to a USB flash drive, etc. Why is this not possible or likely? What am I missing here?

    1. For your particular scenario, you are leaving out the second copy operation described in the metadata analysis, for which East Coast time settings were also in force and the precursor components of the 7zip file were copied to FAT-formatted media. So even if you have a way of explaining how the data made it over to Eastern Europe, you will need a plausible explanation of how and why the data was copied back to the East Coast and then placed on a thumb drive. (The study doesn’t say it was a thumb drive; thumb drives are simply the most common FAT-formatted media, so the use of a thumb drive seems likely.)

      For your particular scenario, if the hypothetical hacker lands on a Unix system with access to the exfiltrated data, he might be able to copy the data to a local hard drive, from over the LAN, using ‘cp’. However, the calculated transfer speed of 23 MB/s mostly rules that out because, as shown in The Need for Speed, local copies to a hard drive are likely much faster than that. In the test results documented in that study, local copies to a hard drive were roughly 2x faster. That is not to say you can’t find an old, slow server with 5400 RPM drives or some similar configuration that might be slow enough to demonstrate those speeds; you just have to decide how likely that scenario is.

      Another circumstantial factor is that mid-20 MB/s transfer speeds are a common characteristic of USB-2 thumb drives, confirmed by the tests in the “Need for Speed” study (also noted by various people, based on their experience, who read the metadata analysis). Finally, hackers typically work hard to minimize their footprint and copying 20 GB of data to a server’s local hard drive leaves a big footprint, even if it is only temporary.

      1. There’s no need to copy anything back to the East coast in my scenario, and I think it still replicates all of the forensics perfectly. All of this could have taken place in Russia.

        People working on a Russian operation will surely be told not to unintentionally leave metadata pointing to them (they may do it intentionally sometimes to play mind games though). A very common practice is to simply use a Virtual Machine with altered location and time zone settings. Within the context of operations targeting the US, it makes perfect sense to have one with settings pointing to the East Coast. The person assembling the archive need not have had any specific intent of leaving behind metadata pointing to the US, just a perfectly natural desire of covering their own tracks. They may not even have been aware that the operations they’re performing would allow inferring the time zone, they’d just have a habit of using the VM just in case. To replicate the forensics, it’s enough to assume this VM was used only in September to create the archives with WinRAR. It could also just be a regular Windows machine, by the way; it’s trivial to change the timezone settings. A VM just sounds more plausible to me.

        It’s quite obvious from Guccifer’s public pronouncements that he himself was not the person who hacked the DNC. So it’s natural to infer that multiple loosely coordinating Russian teams were involved here, the hacker team and the disinformation team at least, quite possibly working from different locations in Russia. They may have been called together for a meeting around July 6 to discuss what to do going forward, and at that point the easiest most obvious way of passing along the data to be disseminated is to hand them an external USB hard drive with the data. These external hard drives will often be formatted as NTFS so it would preserve the 100 ns precision timestamps. Then in September, after archiving, the final product being smaller in size may have been copied to a regular USB key with FAT. This again indicates multiple teams communicating via external drives, and increases the plausibility of the idea that all of the copies took place in Russia.

        This all seems perfectly consistent with the idea that it was a hack. Where’s the flaw in this theory?

        1. People working on a Russian operation will surely be told not to unintentionally leave metadata pointing to them (they may do it intentionally some times to play mind games though). A very common practice is to simply use a Virtual Machine with altered location and time zone settings. Within the context of operations targeting the US, it makes perfect sense to have one with settings pointing to the East Coast.

          I think that your overall scenario fits the fact pattern in the metadata analysis; however, it depends strongly on a rationale for why the hypothetical Russian hackers might set their PC’s time zone to US Eastern. If I were going to guess at a time zone that they might use, I’d probably choose one for Romania, since that is Guccifer 2’s story line. I’d also think that misdirecting to another country like China or NK would be an understandable choice.

          That said, I am thinking of compiling the best alternative theories into a separate blog post (the comments section is getting to be really long and is split across the main blog and the other blog posts). Your scenario would be near the top of my list.

          If your hypothetical Russian hackers had followed your advice, they wouldn’t have found themselves in this situation.

          How one typo helped let Russian hackers in

          Some of the evidence was surprisingly simple, such as timestamps showing when the hackers were working. “The mistake they’ve made is leaving these timestamps,” Hultquist said. “And if you look at enough of them over time, you get a picture of what actual hours this operator is working. And what they come down to is a work schedule that fits right in with western Russia’s time zone.” “Besides that,” Hultquist added, “there’s a lot of Russian language artifacts,” meaning computer code written in the Cyrillic, or Russian, alphabet.

  7. [Moderator’s note: 3 separate (somewhat duplicative) comments have been combined into one; the text is left otherwise unchanged.]

    If you actually knew half as much about tech as you claim to, you would know that all VPNs are not created equal. Taking screenshots of a speed test you got on your particular VPN (which I notice you did not name) tells us absolutely nothing. It is actually possible to get a speed INCREASE through a VPN, depending on the service you’re using and various other factors. Reference: https://www.pcmag.com/roundup/351574/the-fastest-vpns and https://www.howtogeek.com/253195/how-can-a-vpn-improve-download-speed/ (Yes, the PC Mag article is from 2017, but it still clearly belies your statement that a VPN will ALWAYS slow down your speed.) So many factors can go into someone’s connection speed that the idea of your Speedtest.net screenshots telling us anything whatsoever about Guccifer’s connection is downright laughable.

    Also, the original claim made did not hinge on whether or not Guccifer was connected to a VPN. Your claim was that download speeds of 23 MB/s had to indicate a local transfer, rather than a transfer over the Internet. This is patently false. I had an Internet connection in mid-2016 that regularly downloaded at faster speeds than that, and I’m a consumer in America; other countries have faster connections, as do business lines. Notably, you only brought up the fact that Guccifer connected through a VPN when you were called out on being completely wrong regarding the possibility of such high transfer speeds over the Internet.

    Another note: While 23 MB/s may be a typical transfer rate for USB 2.0, the MAX transfer rate is closer to 40-50 MB/s. USB transfer rates will also vary significantly depending on many factors, such as what USB drive you’re using; like VPNs, not all USB drives are created equal. If both USB and an ISP connection were capable of both meeting and exceeding the 23 MB/s transfer rate in mid-2016, which they were, then the transfer rate actually tells us absolutely nothing.

    This website isn’t even hosted on your own domain name, for crying out loud. This is a WORDPRESS SITE. Literally anyone could have made this website in fifteen minutes. You know nothing about tech and are simply a con man trying to fool people, and you should be ashamed.
    ————-
    lol, are you honestly referring to you taking a bunch of screenshots of your Speedtest.net results as a “study?” That’s hilarious.
    ————-
    Anyone who understands this type of technology at even a casual consumer level would understand he has no idea what he’s talking about, since VPN speeds vary HUGELY depending on what service you use and various other factors, and can actually INCREASE your connection speed.

    1. This website isn’t even hosted on your own domain name, for crying out loud.

      I have chosen to remain anonymous. Hosting my own domain would compromise that goal.

      This is a WORDPRESS SITE. Literally anyone could have made this website in fifteen minutes.

      I decided to post the analysis on a WordPress site so that there is a forum for reviewers like yourself where they can provide feedback. Although the Guccifer 2.0 NGP/VAN Metadata Analysis report took quite a bit longer than 15 minutes to research and write, I agree that the web site isn’t much to look at.

      It is my hope that the analysis is described in sufficient detail that interested parties can replicate the research and confirm the findings if that is something they’re interested in doing. If not, hopefully the description is detailed enough that reviewers like yourself can understand the process used to arrive at the conclusions and findings documented in the report.

      Also, the original claim made did not hinge on whether or not Guccifer was connected to a VPN.

      I agree. In fact, there are no references to VPNs in the main article. VPN nodes were used in The Need for Speed article to illustrate the effect that increasing geographic distance has on realized transfer speeds. If I had access to well-connected servers in DC, London, Germany, Romania, and Russia, I would have used them to conduct tests. The VPN nodes were the next best thing. The actual measured transfer speeds to those VPN nodes are not important in that study, except to illustrate that as packets travel greater distances over the public Internet, transfers slow down.
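
      One way to see why distance matters, independent of the VPN tests: a single TCP connection can have at most one receive window of data in flight per round trip, so its throughput is bounded by the window size divided by the round-trip time. The sketch below works through that arithmetic. The 64 KiB window (TCP without window scaling) and the round-trip times are illustrative assumptions, not values measured in the study, and hosts that negotiate larger windows can do considerably better.

```python
def max_single_stream_mb_s(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one TCP stream: one window of data per round trip."""
    return window_bytes / (rtt_ms / 1000.0) / 1_000_000


def window_needed_bytes(rate_mb_s: float, rtt_ms: float) -> float:
    """Window size required to sustain rate_mb_s at the given round-trip time."""
    return rate_mb_s * 1_000_000 * rtt_ms / 1000.0


WINDOW = 64 * 1024  # assumption: 64 KiB window, i.e. no TCP window scaling

# Hypothetical round-trip times, for illustration only (not measured values).
for label, rtt_ms in [("same building", 1), ("same coast", 20), ("trans-Atlantic", 100)]:
    bound = max_single_stream_mb_s(WINDOW, rtt_ms)
    print(f"{label:15s} RTT {rtt_ms:4d} ms -> at most {bound:6.2f} MB/s")

# Window a single stream would need to sustain 23 MB/s at a 100 ms round trip.
print(f"window needed: {window_needed_bytes(23, 100) / 1_000_000:.1f} MB")
```

      The point of the sketch is only that the achievable rate of a single stream falls as round-trip time grows unless the window grows with it. It does not model packet loss, congestion, or parallel streams.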

      Notably, you only brought up the fact that Guccifer connected through a VPN when you were called out on being completely wrong regarding the possibility of such high transfer speeds over the Internet.

      Again, (1) the original report doesn’t mention VPNs, and (2) the “Need for Speed” article uses VPN nodes to show the effect that increasing distances have on transfer speeds. I did mention in a reply to a comment that some researchers had noted that Guccifer used a VPN in his communications, and I said that if he used a VPN to transfer hacked data, it would likely slow down the transfer. I also pointed out that most hackers will use either a VPN or a compromised computer as an end point to mask their IP address. They could also use a hacked router, but that won’t give them the storage resources they need to collect the hacked data.

      “The Need for Speed” study was an attempt to provide hard data, rather than just hand waving. Tests were run to determine more than just Internet speeds.

      Regarding the cited 23 MB/s transfer rate, originally the purpose of the metadata analysis was to disprove the claim that Guccifer 2 is a remote Romanian/Russian hacker (at least with respect to the NGP VAN 7zip file). The 23 MB/s rate was fast enough to support the “local copy” claim and it seemed like a slam dunk that 23 MB/s was too fast to support the claim that a multi-gigabyte transfer back to Eastern Europe over the Internet could be accomplished at that speed.

      Then, as reviewers noticed that the metadata analysis article asserted that East Coast time zones were in force, those looking for contrarian scenarios came up with the idea of a hypothetical host located on the East Coast, which was connected to the DNC and was used as a collection point. Not just any hypothetical host but one with a 300+ Mbit/s Internet connection and that was close enough to the DNC to sustain a 23 MB/s transfer rate.

      I have decided to concede that point, and agree that if you introduce this hypothetical local host with a very high speed Internet connection that is geographically close to the DNC you may have a scenario that meets the 23 MB/s transfer speed criteria. When I have some time, I plan on writing a short blog post on the topic, describing a few additional considerations that need to be addressed to make that scenario work.

      As a reviewer of the original analysis report you will be left with a decision as to how you weigh the likelihood of this local intermediate host scenario versus the “local copy” conclusion reached in the metadata analysis.

      As you might expect, I think that the local intermediate host scenario is much less likely than the local copy interpretation of the metadata.

      The local copy conclusion was influenced by observations made regarding the second copy operation, which the file timestamps indicate occurred on Sept. 1, 2016. There again are indications that Eastern time zone settings were in force and the data was written to FAT formatted media. The FAT formatted media suggests the use of a USB thumb drive, because USB thumb drives are one of the most common FAT-formatted storage devices. The use of a thumb drive would support the conclusion that the second copy operation was also a local copy operation, with someone present to operate the thumb drive.

  8. I don’t understand some of the underlying assumptions. Why is it assumed the files were pulled over the same VPN used to communicate? Why is it assumed the beginning connection speed (Romanian) isn’t high enough to still be at least 180 Mbps even after speed loss? Why is it assumed there was no stateside machine used as a go-between? Why is it assumed G2.0 was telling the truth about where he originated the hack from? If a VPN were used, why is it assumed to be commercial?

    1. Why is it assumed the files were pulled over the same VPN used to communicate?

      The “Need for Speed” study simply reports performance data for various scenarios. Thus, if one were to assume that Guccifer 2 used a VPN service, they can refer to the study to see how the use of a commercial VPN service might affect performance.

      Most hackers will use some method to mask their IP address. This often involves proxying through a VPN or a compromised system. In some cases, the hacker will string several computers together to further mask his IP address. Any method used to mask the IP address will slow down communication speeds.

      Why is it assumed the beginning connection speed (Romanian) isn’t high enough to still be at least 180 Mbps even after speed loss?

      Please refer to the “Need for Speed” study. It shows that, in spite of a fast apparent speed, the actual speed will drop off as the distance to the destination host increases. Add a VPN to that and the speed will be further reduced.
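
      For readers keeping track of the units in this exchange, the following minimal sketch converts between MB/s (megabytes per second, as used in the analysis) and Mbit/s (megabits per second, as used in ISP plan speeds). The function names are made up for illustration. The figures are payload arithmetic only, before any protocol overhead, and a nominal line rate is not the same as an achieved transfer rate.

```python
BITS_PER_BYTE = 8


def mb_per_s_to_mbit_per_s(mb_per_s: float) -> float:
    """Megabytes per second -> megabits per second."""
    return mb_per_s * BITS_PER_BYTE


def mbit_per_s_to_mb_per_s(mbit_per_s: float) -> float:
    """Megabits per second -> megabytes per second."""
    return mbit_per_s / BITS_PER_BYTE


# The rate cited in the study, expressed in megabits per second.
print(f"23 MB/s = {mb_per_s_to_mbit_per_s(23.0):.0f} Mbit/s")

# Nominal plan speeds mentioned in this comment thread, expressed in MB/s.
for plan_mbit in (200, 250, 500, 1000):
    print(f"{plan_mbit:5d} Mbit/s = {mbit_per_s_to_mb_per_s(plan_mbit):6.2f} MB/s")
```

      This is why the question above asks about “at least 180 Mbps”: sustaining 23 MB/s requires roughly 184 Mbit/s of usable throughput end to end.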

      Why is it assumed there was no stateside machine used as a go-between?

      Initially, the study intended to refute Guccifer 2’s claims that he obtained the “NGP VAN” data via a remote hack from somewhere in Eastern Europe. In the course of the analysis several observations were made, including the observation that East Coast settings were in force for both the first and second copy operations noted in the study.

      For reviewers who strongly want to hold onto the conclusion that the data was derived from a hack, they have had to invent a local host which is both close enough to the DNC and with high enough bandwidth to support the possibility of a 23 MB/s transfer rate.

      It is probably safe to say that without those other factors discussed in the analysis, there would be few (or no) people asking things like: “Why didn’t your study suggest a stateside system used as a go-between?”

      Why is it assumed G2.0 was telling the truth about where he originated the hack from?

      One of the main points of the study is to test whether G2.0 was telling the truth. To do that, the study starts with the assumption that he was and then works its way from there.

      If a VPN were used, why is it assumed to be commercial?

      Some security researchers have noted that Guccifer used a VPN to mask his communications. It is not a big leap to assume that he might use one to mask his IP address when attempting to hack the DNC.

      The “Need for Speed” analysis simply presents performance data using a VPN service to show both the impact of that service and the impact of communication distances on transfer speeds. This author does not have access to some hypothetical custom VPN service, nor an array of well connected servers spread over wide geographic distances, therefore a VPN was used to demonstrate the impact of increasing communication distances.

      1. [Moderator’s note: edited for legibility. Text otherwise unchanged.]

        Why is it assumed the beginning connection speed (Romanian) isn’t high enough to still be at least 180 Mbps even after speed loss?

        Please refer to the “Need for Speed” study.

        Your study doesn’t cover typical bandwidth availability between well connected data centers on opposite sides of the Atlantic, so folks are still wondering why you seem to think …

        For reviewers who strongly want to hold onto the conclusion that the data was derived from a hack, they have had to invent a local host which is both close enough to the DNC and with high enough bandwidth to support the possibility of a 23 MB/s transfer rate

        Pointing out that residential 200mbit, 250mbit, 500mbit, and 1000mbit internet service is available right now and has been available to enterprise customers for a relatively long time does not amount to “strongly wanting to hold onto […] a hack.”. It’s strictly a matter of trying to communicate to you that 200mbit/s~ file transfers are actually routine these days. All anyone can be saying by pointing this out is that you’re using language that is far too strong on the point of transfer speed, and that given the availability of high speed internet it’s just not an accurate statement to say that 23MB/s is categorically impossible to achieve in any other situation but over a LAN or with a USB device.

        Just because someone has doubts about your conclusions in the area of transfer speed doesn’t mean they have an opinion on whether or not it was a hack or an insider leak. This is just a technical detail, it’s becoming a wider talking point, and it’s going to taint the rest of the analysis when it’s pointed out that 23MB/s is not a difficult transfer speed to achieve in multiple plausible scenarios.

        It’s actually my opinion that an insider leak is a slightly more likely explanation than a hack, but I mainly say that because insiders are generally the biggest threat to any large organization. I digress.

        The “Need for Speed” analysis simply presents performance data using a VPN service to show both the impact of that service and the impact of communication distances on transfer speeds.

        Why does your performance data seem to only be concerned with a residential connection?

        This author does not have access to some hypothetical custom VPN service, nor an array of well connected servers spread over wide geographic distances

        ah.

        Have you considered leasing time on a pair of servers at distant well-connected data centers and testing your hypothesis that file transfer speeds of 23MB/s are impossible over the internet?

        1. Why is it assumed the beginning connection speed (Romanian) isn’t high enough to still be at least 180 Mbps even after speed loss?

          Your study doesn’t cover typical bandwidth availability between well connected data centers on opposite sides of the Atlantic, so folks are still wondering why you seem to think …

          I have seen it asserted by some that two “well-connected hosts” might demonstrate a much lower drop off in speed than I saw in my test results. I used VPN nodes as end points to simulate the effect of communication with a well-connected host, mainly hoping that a commercial VPN vendor would locate those nodes into well-connected, well-provisioned facilities.

          Why does your performance data seem to only be concerned with a residential connection?

          My thinking that Guccifer 2 would use normally available consumer Internet technology was influenced in part by his dialog and by reports from some security researchers that he used a VPN for things like email. That certainly might be a faulty assumption.

          In any event, if you or others have the capability to perform more extensive tests, please send along the test results so that they can be posted here. I will escalate that sort of report to a blog post if the results warrant.

          FYI, my test set up for Internet testing included one well-connected server with a 1 GB/s symmetric Internet service, a residential connection, and a commercial VPN service.

          […] All anyone can be saying by pointing this out is that you’re using language that is far too strong on the point of transfer speed, and that given the availability of high speed internet it’s just not an accurate statement to say that 23MB/s is categorically impossible to achieve in any other situation but over a LAN or with a USB device.

          I agree, and am re-posting below something that I wrote up on another comment thread. Also, my apologies for throwing you (and others) under the bus of “strongly wanting to hold onto […] a hack.” The purpose of this comment section is to provide a forum for feedback and counter-arguments and that comment wasn’t constructive.

          Regarding the cited 23 MB/s transfer rate, originally the purpose of the metadata analysis was to disprove the claim that Guccifer 2 is a remote Romanian/Russian hacker (at least with respect to the NGP VAN 7zip file). The 23 MB/s rate was fast enough to support the “local copy” claim and it seemed like a slam dunk that 23 MB/s was too fast to support the claim that a multi-gigabyte transfer back to Eastern Europe over the Internet could be accomplished at that speed.

          Then, as reviewers noticed that the metadata analysis article asserted that East Coast time zones were in force, those looking for contrarian scenarios came up with the idea of a hypothetical host located on the East Coast. This intermediate host could connect to the DNC (behind its firewall) and was used as a collection point. Not just any hypothetical host but one with a 300+ Mbit/s Internet connection and that was close enough to the DNC to sustain a 23 MB/s transfer rate.

          I have decided to concede that point, and agree that if you introduce this hypothetical local host with a very high speed Internet connection that is geographically close to the DNC you may have a scenario that meets the 23 MB/s transfer speed criteria. When I have some time, I plan on writing a short blog post on the topic, describing a few additional considerations that need to be addressed to make that scenario work.

          As a reviewer of the original analysis report you will be left with a decision as to how you weigh the likelihood of this local intermediate host scenario versus the “local copy” conclusion reached in the metadata analysis.

          As you might expect, I think that the local intermediate host scenario is much less likely than the local copy interpretation of the metadata.

          The local copy conclusion was partly influenced by observations made regarding the second copy operation that (per the file timestamps) indicate it occurred on Sept. 1, 2016. There are indications that Eastern time zone settings were in force and the data was written to FAT formatted media. The FAT formatted media suggests the use of a USB thumb drive, because USB thumb drives are one of the most common FAT-formatted storage devices. The use of a thumb drive would support the conclusion that the second copy operation was also a local copy operation with someone present to operate the thumb drive.

  9. I’m curious why Guccifer2.0 identified his file as consisting of NGP/VAN data when it appears to be .doc and other files stored in a series of conventional server directories. NGP/VAN is a cloud-based SAAS platform that layers various electioneering applications atop the Democratic Party’s national voter database. I’d like to know if any part of that stack would be hosted locally in the DNC’s offices. Guccifer told an interviewer that he exploited a zero-day exploit in that software to breach the DNC’s files, but this appears to have been discredited. NGP/VAN’s nameservers are hosted on AWS.

    I’d also like to understand how much data transfer over the internet was typical for the DNC. Surely their hosting provider or ISP can provide logs demonstrating 1.9GB leaving their server at the alleged time of the breach? Or, if that was not anomalous, show how it would have gone unnoticed?

    1. You aren’t the only person to have asked about the relevance of Guccifer 2’s NGP VAN 7zip file to the NGP VAN service. This author provides a good rundown of the file contents: Guccifer 2.0 – 13Sept2016 Leak – A Reader’s Guide. He concludes: “No, this looks more like a discarded hard drive that was harvested and falsely labeled as a ‘hack’ of the DNC.”

      Sounds ominous, but as a reminder, the metadata analysis concludes that 20 GB of data was likely copied, but only 2 GB were disclosed by Guccifer 2.0 in that 7zip file. A case could be made that this 2 GB excerpt served as a warning message sufficient to demonstrate that he had more and that it was sourced from the DNC. Since the DNC has neither confirmed nor denied Guccifer 2’s claims, the metadata analysis took the point of view that the data is derived from a DNC source, but may not have a lot to do with NGP VAN. Except, as you note, Guccifer 2 claimed initially that he obtained the data by hacking the NGP VAN front end using a zero day exploit.

  10. Forensicator – Your work is fantastic and I think anyone who understands this type of internet technology at a professional level also understands that you know what you write about. There simply is no question. However, I would like to see you approach the question a different way. (Full Disclosure: I haven’t read all of your writing so it’s possible this is addressed elsewhere.)

    Approaching the speed issue and working backwards to try to invalidate your results, I would suggest the following scenario:

    Russian hackers intent on staying undiscovered would not access the files directly or over a VPN, but rather compromise a system local (in a general sense) to where the attack end point exists. For example, by taking control of a machine (server or desktop) in a well connected office building they could achieve these speeds. It is not uncommon for a serious office to have OC-48 or even higher bandwidth. In the late 2000’s I worked in such a building. In the late 90’s at BBN I experienced this bandwidth as well. Add to this possibility that the attacker compromised a server in a colocation facility (think Equinix Dulles) that has massive amounts of bandwidth available. At those speeds you could clearly achieve the transfer rates discussed.

    Reading Conclusion 1 and Conclusion 2 in your analysis post, I think I can make my scenario fit your conclusions. The local system in Conclusion 1 (with Eastern time zone) was really a well connected system on the East Coast. Conclusion 2 also fits nicely with the colocated server scenario.

    If I were to truly try to attack your writing, this is probably where I would start. It would be great to hear from you regarding this specific scenario.

    Regarding Conclusion 3: This is very interesting. As a professional taking files, I would use the -a flag on cp to preserve the timestamps of the file. This would point to an amateur doing the copy. However, since there are directories they would have had to use the -R flag to recursively copy the directories. Why not use -a? Would the person copying know the -R flag and not -a?
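
    The timestamp behavior described here can be reproduced in a few lines. This is a minimal sketch that uses Python’s `shutil` as a stand-in for the `cp` variants: `shutil.copyfile` copies contents only, so the copy gets a fresh modification time (like a plain `cp` or `cp -R`), while `shutil.copy2` also attempts to preserve timestamps (roughly what `cp -p` or `cp -a` does for modification times). The file names and the one-year-old timestamp are made up for illustration, and nothing here is taken from the actual copy operation.

```python
import os
import shutil
import tempfile
import time


def copy_and_report():
    """Return (source, plain-copy, preserved-copy) modification times."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "original.doc")
        with open(src, "w") as handle:
            handle.write("example contents")

        # Backdate the source file by about a year (hypothetical value).
        one_year_ago = time.time() - 365 * 24 * 3600
        os.utime(src, (one_year_ago, one_year_ago))

        plain = os.path.join(workdir, "plain_copy.doc")
        preserved = os.path.join(workdir, "preserved_copy.doc")
        shutil.copyfile(src, plain)    # contents only: new modification time
        shutil.copy2(src, preserved)   # contents plus timestamps where possible

        return (
            os.path.getmtime(src),
            os.path.getmtime(plain),
            os.path.getmtime(preserved),
        )


src_mtime, plain_mtime, preserved_mtime = copy_and_report()
print("plain copy is newer than source by (s):", round(plain_mtime - src_mtime))
print("preserved copy differs from source by (s):", round(abs(preserved_mtime - src_mtime)))
```

    In a copy made without timestamp preservation, the last-modified times record when the copy ran rather than when the files were originally edited.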

    Again, I want to be clear on two points: 1) I haven’t combed over all of your writing and the answers may already be there. 2) I agree with your conclusions and simply wanted to point out that my scenario above may poke a hole in it.

    Great work.

    1. Agree with TR. The one missing piece is the backbone speed of the DNC router. All the VPN analysis goes down the tubes if that speed isn’t matched by the Forensicator’s test.

  11. Great work! I’m trying to poke holes in this. Do we have a file that we know is from the DNC server that we can match with the file metadata from Guccifer 2.0?

    1. Some researchers were successful in finding documents disclosed by Guccifer 2 in attachments to the DNC emails present on Wikileaks, IIRC. Those documents (found as attachments) were compared to the documents leaked by Guccifer 2. As detailed in g-2.space, the researchers were able to show that the documents released by Guccifer 2 had been doctored. A similar thing could be tried with the NGP VAN 7zip files, but I haven’t seen/heard of anyone trying to do that.
