MB: Mega Bytes or Mega Bits?

UPDATE (2017-08-02): This blog entry has been updated with additional information which documents actual transfer rates seen when targeting both a close US host and another domestic US host located on the opposite coast.  The effect of using a VPN is also shown.

The bottom line is that the rate drops dramatically when packets have to transit large distances (even without factoring in the use of a VPN, or going transatlantic): the transfer speeds dropped from 14 MB/s to 2 MB/s.

Detailed test results are documented in the blog entry,  The Need for Speed.

Some reviewers have asked about the use of “MB/s” as a measure of transfer speed. In the Guccifer 2.0 NGP/VAN Metadata Analysis report, “MB/s” refers to Mega Bytes per second, where “Mega” is one million (1,000,000). Some reviewers have confused this notation with “Mb/s”, or mega bits per second, the unit often quoted by ISPs. The two measures are easily mistaken for each other, and there are articles on the Internet that discuss this topic, for example here and here.
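
To make the distinction concrete, here is a minimal sketch of the conversion, assuming the decimal definitions used in the report (1 byte = 8 bits, “Mega” = 1,000,000). The function names are made up for illustration.

```python
BITS_PER_BYTE = 8

def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert a rate in MB/s (megabytes per second) to Mb/s (megabits per second)."""
    return mbytes_per_s * BITS_PER_BYTE

def mbits_to_mbytes(mbits_per_s: float) -> float:
    """Convert a rate in Mb/s (megabits per second) to MB/s (megabytes per second)."""
    return mbits_per_s / BITS_PER_BYTE

# The report's estimated rate, expressed in the unit an ISP would quote.
print(f"22.6 MB/s = {mbytes_to_mbits(22.6):.1f} Mb/s")
```

The factor of 8 is the whole story: a figure quoted in Mb/s looks eight times larger than the same rate quoted in MB/s.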

This handy calculator will let us do all sorts of what-if comparisons; that particular “calculator” link converts 22.6 MB/s (the estimated transfer rate cited in the report) into the following chart.

compare_22_6_MB_s_to_Network_Speeds

As you can see it is at about the 20% level of a 1 Gb/s local area network (LAN), which is typical of many enterprise/SOHO wired (LAN) networks, and as far as “carriers” go,  some form of “optical link” will be required.  For the gory details, see this Wikipedia article on Optical Carrier transmission rates.
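
As a rough cross-check of the chart, the sketch below compares 22.6 MB/s against a few nominal link rates. The rates in the table are the standard published line rates for those link types, not values read from the calculator, and the helper name is hypothetical.

```python
RATE_MB_S = 22.6               # estimated transfer rate cited in the report
RATE_MBIT_S = RATE_MB_S * 8    # the same rate in megabits per second

# Nominal (theoretical) line rates in Mb/s; real throughput is lower.
NOMINAL_LINKS_MBIT_S = {
    "100 Mb/s Ethernet": 100.0,
    "OC-3": 155.52,
    "OC-12": 622.08,
    "1 Gb/s Ethernet": 1000.0,
}

def utilization(rate_mbit_s: float, link_mbit_s: float) -> float:
    """Fraction of a link's nominal capacity that the given rate would consume."""
    return rate_mbit_s / link_mbit_s

for name, capacity in NOMINAL_LINKS_MBIT_S.items():
    share = utilization(RATE_MBIT_S, capacity)
    verdict = "fits" if share <= 1 else "exceeds the link"
    print(f"{name:<18} {share:7.1%}  {verdict}")
```

Under these assumptions the rate comes out a little under 20% of a 1 Gb/s LAN and above what an OC-3 can carry, which agrees with the reading of the chart.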

In practice, actual transmission rates will fall well below the theoretical rates shown above, because packets transmitted over the Internet have to transit through many switches and must share bandwidth with other users. Further, copying multiple small files increases the need for “hand-shaking” messages, which further decreases the effective transmission speed. The only way to find the actual speeds that can be achieved is to run tests. The typical ISP-provided “speed test” will show optimistic speeds, but it is a start. The following graphic shows the result of a cable provider’s speed test.

cable-speed-test-20-miles

In that test, we accessed one of the provider’s hosts that is about 20 miles away (as the crow flies). The 113.4 Mbits/s rate corresponds to a 14.2 MB/s rate, well below 23 MB/s.

Here is another test, accessing a host that is on the opposite coast (3100 miles away).

cable-speed-test-3100-miles

We can see that increases in the distance traveled can have a major impact on the transmission speed.  In this test, accessing a host on the opposite coast cut the download speed by a factor of 7.

ThreatConnect, a security firm, determined that Guccifer 2 used a commercial VPN service to mask his IP address.  ThreatConnect’s analysis is described in a blog entry.  Their key finding is summarized below (emphasis added).

Now, after further investigation, we can confirm that Guccifer 2.0 is using the Russia-based Elite VPN service to communicate and leak documents directly with the media. We reached this conclusion by analyzing the infrastructure associated with an email exchange with Guccifer 2.0 shared with ThreatConnect by Vocativ’s Senior Privacy and Security reporter Kevin Collier. This discovery strengthens our ongoing assessment that Guccifer 2.0 is a Russian propaganda effort and not an independent actor.

In March of this year, Adam Carter followed up on ThreatConnect’s research [see http://g-2.space/ in the section titled “UPDATE (12 March)”]. Adam disputes their claim that the VPN IP address used was somehow “dedicated” for use by Guccifer 2 and perhaps other hackers with connections to Russia. Adam writes:

So… it turns out that if ThreatConnect had tried using the default options – they would have been allocated the “exclusive” IP address that was NEVER really exclusive.

They’ve caused concern and distress unduly for a VPN Service provider by misrepresenting the service and produced false-positive indicators by suggesting the IP address was used by a shady group of Russians/Guccifer2.0 with exclusivity.

The discussion above is provided as background, simply to establish that any experiment that intends to replicate Guccifer 2’s use of the Internet should use a VPN service and measure speeds over that VPN connection.

If we enable a VPN service and retry the speed test, targeting a nearby server, we see the following.

cable-speed-test-vpn-400-miles

The download speed over the VPN is roughly 60% of the speed of a direct connection. There are probably a few reasons for this drop in speed: (1) the test no longer goes only through the provider’s network, (2) transiting the VPN server introduces another hop, (3) the VPN provider may implement bandwidth throttling, and (4) there may be additional overhead introduced by the VPN client, which is implemented in software.
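
The three speed tests can be put side by side with a little arithmetic. This is only a sketch: the cross-country and VPN figures are derived from the rounded ratios quoted in the text (“a factor of 7”, “roughly 60%”), not from the raw test readings, and the variable names are made up for illustration.

```python
BITS_PER_BYTE = 8
TARGET_MB_S = 22.6  # estimated transfer rate cited in the report

direct_mbit_s = 113.4                          # nearby host, from the first speed test
direct_mb_s = direct_mbit_s / BITS_PER_BYTE    # the same rate in MB/s
cross_country_mb_s = direct_mb_s / 7           # "cut the download speed by a factor of 7"
vpn_mb_s = direct_mb_s * 0.60                  # "roughly 60% of the speed of a direct connection"

scenarios = {
    "direct, nearby host": direct_mb_s,
    "direct, opposite coast": cross_country_mb_s,
    "VPN, nearby server": vpn_mb_s,
}

for name, rate in scenarios.items():
    print(f"{name:<24} {rate:6.2f} MB/s  ({rate / TARGET_MB_S:.0%} of {TARGET_MB_S} MB/s)")
```

On these approximate numbers, none of the three scenarios reaches 22.6 MB/s.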

Let’s fire up the calculator again and ask it to compare our 22.6 MB/s transfer rate to that seen for peripherals.

compare_22_6_MB_s_to_Peripheral_Speeds

The 23 MB/s transfer rate falls comfortably into the range of a USB 2.0 device.  It is worth noting that the actual transfer rate will be further limited by capabilities of the USB flash drive electronics.

One more, disk drives.

compare_22_6_MB_s_to_Disk_Drives

Clearly, almost any disk drive can sustain 23 MB/s.

Caveat: we don’t know how accurate or current the data is that was used for that calculator.  There are lots of variables to consider, such as overhead, and especially with public networks such as the Internet other factors need to be considered: contention, rate-limiting, and so on.

We are just trying to place the 22.6 MB/s rate in perspective, and add support for the conclusion that the initial copy operation was likely done locally, either with direct access to the system where the data is stored, or over a high speed LAN.

That is not the whole story, however.  The file copy operations observed in this analysis were performed file-by-file.  There is a lot more overhead, both in file transmission and file and directory creation for file-by-file transmission than would be seen in a best case, single big file scenario.
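
A simple way to picture the file-by-file penalty is to charge a fixed cost per file on top of the time spent moving bytes. The sketch below does that. The model and every number in it (raw rate, per-file overhead, file counts) are hypothetical, chosen only to show the shape of the effect; they are not taken from the analysis.

```python
def effective_rate_mb_s(total_mb: float, n_files: int,
                        raw_rate_mb_s: float, per_file_overhead_s: float) -> float:
    """Effective throughput when each file adds a fixed overhead.

    elapsed = time to move the bytes at the raw rate
              + per-file overhead (create, open, close, hand-shaking) for every file
    """
    elapsed_s = total_mb / raw_rate_mb_s + n_files * per_file_overhead_s
    return total_mb / elapsed_s

# Hypothetical comparison: the same 1000 MB moved as one file vs. 2000 small files.
single = effective_rate_mb_s(1000, 1, raw_rate_mb_s=50, per_file_overhead_s=0.01)
many = effective_rate_mb_s(1000, 2000, raw_rate_mb_s=50, per_file_overhead_s=0.01)
print(f"one big file:     {single:.1f} MB/s")
print(f"2000 small files: {many:.1f} MB/s")
```

In this toy model the per-file cost halves the effective rate even though the link itself is unchanged, which is why an average rate measured over a file-by-file copy understates the raw speed of the underlying connection or device.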

 

13 thoughts on “MB: Mega Bytes or Mega Bits?”

  1. Comments are closed. They have been open for over a month; hopefully this has given ample opportunity for readers to comment. Responding to comments is worthwhile, but time-consuming; The Forensicator needs to turn his attention to other projects. Thank you everyone who has taken the time to comment.
    — The Forensicator


  2. It’s moot since the entire analysis hinges on “fixing” the top level file time stamps by adding an hour to them without any justification.


    1. The reasoning behind adding an offset to the last modified times of the top-level files in the 7zip file is detailed in Guccifer 2.0 NGP/VAN Metadata Analysis. Below is an excerpt.

      The times recorded in those .rar files are local (relative) times; this determination is detailed in the blog post, RAR Times: Local or UTC? The times recorded in the .7z file are absolute (UTC) times. If you look at the recorded .rar file times, you will see times like “7/5/2016 6:39:18 PM”, and the times in the .7z file will be at some offset to that, depending on your time zone. For example, if you are in the Pacific (daylight saving) time zone, the files shown in the .7z file will read 3 hours earlier than those shown in the .rar files, as shown below.

      Time offset between 7zip and rar files


  3. why is it so important that it actually be Guccifer 2.0? If that’s all you are proving here, I can accept that. But there seems to be enough plausibility for it to have been someone in a remote location nearby.

    If it could’ve been done from the Russian Embassy by another person at those speeds, then why do we care if it’s Guccifer?

    Shouldn’t the conclusion be: “It’s highly unlikely that Guccifer 2.0 is responsible for this portion of the hack. It would have had to have EITHER been someone on site OR someone at a nearby remote location.”?

    Just trying to understand the goal of this a bit better.


    1. why is it so important that it actually be Guccifer 2.0?

      Guccifer 2 remains an enigma for many security researchers. Adam Carter at g-2.space has done a solid job of covering the controversy surrounding Guccifer 2. As to whether it is important to discover more about Guccifer 2, there are probably as many motives as there are people who care about the issue. For me, my motives run along the lines of the VIPS who are asking for formal investigations into the “Russia hacking efforts influenced the elections” narrative. Ideally, such an investigation would result in fact based public disclosures that would provide convincing evidence to support the conclusions that result from such an investigation.

      Although many security researchers have significant doubts about Guccifer 2’s legitimacy, his presence is still influencing US public policy. As recently as two weeks ago, his name came up at the prestigious Aspen Security Forum. In this Youtube video clip, one of the panelists mentions Guccifer 2 and says that “At a certain point, you would have to have blinders and ear muffs on not to know that Guccifer 2 is a Russian intelligence agent.”

      If it could’ve been done from the Russian Embassy by another person at those speeds, then why do we care if it’s Guccifer?

      There are many possible conclusions that can be drawn from the observations made in the analysis, some more probable and plausible than others. On your specific suggestion that someone at the Russian Embassy aided Guccifer 2, that would be (IMO) a pretty big deal if true. In any event, such a scenario is certainly counter to Guccifer 2’s narrative.

      Although a non-technical argument, I don’t know why the Russians would introduce additional risk by executing part of their operation on US soil, especially out of the Embassy. They know that they will be surveilled out the wazoo.

      Shouldn’t the conclusion be: “It’s highly unlikely that Guccifer 2.0 is responsible for this portion of the hack. It would have had to have EITHER been someone on site OR someone at a nearby remote location.”?

      The point of the analysis is to make its observations public so that the community/public at large can arrive at their own preferred conclusions. Or, hopefully, the study might encourage additional investigation and research.


  4. All this proves is that (s)he isn’t in Romania using a VPN. Even that isn’t conclusive as the attacker could be remoted into a states side host via RDC or other remote protocol. Essentially this proves nothing.


    1. All this proves is that (s)he isn’t in Romania using a VPN.

      If this above refers to the transfer speed estimate, that is just one part of the analysis. It is the part of the analysis that receives the most heat, but is not necessarily the most compelling factor. Consider, for example, the second copy operation done on Nov. 1, 2016, likely on the East Coast with indications that the results were written to a thumb drive. That suggests the physical presence of someone to plug in and retrieve the thumb drive. Yes, we can bring another actor into the picture to explain that observation, but we have then moved well away from the “remote Russian hacker” narrative.

      […] the attacker could be remoted into a states side host via RDC or other remote protocol

      ThreatConnect reported in their analysis that Guccifer 2 used a commercial VPN service vectoring through Russia (IIRC) for previous communications. Did he decide to use a different approach when grabbing the “NGP VAN” files? If you contemplate the use of a host close to the DNC, you’ll also have to address: (1) how did Guccifer 2 obtain access to this host? (2) how would Guccifer 2 avoid the risk of disclosing that IP address in DNC logs? (3) even though this hypothetical host is close to the DNC, can it sustain a 23 MB/s transfer rate? and lastly (4) why would Guccifer 2 introduce this additional host?

      Re: the transfer speed, although the average transfer rate was estimated at 23 MB/s, if we look at a subset of the metadata (the FEC directory and some other top-level files) which has no internal gaps and represents 40% of the total bytes transferred (869 MB), the calculated transfer rate for that chunk of files is 28 MB/s; that speed will be difficult to obtain over the Internet even with very high speed connections at both ends.

      Given those complications, some reviewers have posited a “local pivot”, where the files are first copied in bulk to a local directory on a DNC server and then uploaded back to wherever Guccifer 2 is located. As I mentioned in another comment, unexplained in that scenario is why would a remote hacker need to make that local copy, or want to? It leaves a large footprint (perhaps 20 GB per the analysis) and is unnecessary.

      Essentially this proves nothing.

      The purpose of the study is to analyze the file metadata present in the “NGP VAN” data disclosed by Guccifer 2, which he attributes to the DNC. Guccifer 2 also claims to be Romanian; a claim that has been disputed. He also claims to have obtained the data by hacking DNC servers (remotely).

      The analysis does not prove anything, but tries to reach plausible conclusions based on the data. Those conclusions generally dispute Guccifer 2’s claims. It is up to those who review the analysis to decide on the degree to which those conclusions are compelling.


  5. 22.6 megabytes is 180 megabits. 125 megabytes is 1 gigabit. How did you determine 22.6 megabytes is 80% of 125 megabytes? Comcast in the US offers speeds greater than 180 megabits to residential users. How did you determine 180 megabits is too high for an Internet connect speed when a residential ISP offers higher speeds?


    1. Thank you for the feedback. As you point out, the 80% figure is incorrect. That is unfortunately a misplaced comment that was meant to apply to the next section on peripherals (USB-2). Even then, that theoretical number is in fact closer to 65% (not 80% – I eyeballed the percentage – didn’t take into account that the lower scale is log). I will make the corrections.

      Although those detailed figures in this short write up were incorrect (and will be fixed), they won’t change the overall conclusions in the analysis. In practice, you’ll find that a rate somewhere between 20 MB/s and 25 MB/s is a typical speed when writing to a USB-2 flash drive. (As mentioned in the write up, file by file copy operations will slow things down to well below the theoretical speed.)

      Although many have pointed out that their Internet provider or their company’s fiber link may provide theoretical speeds that perhaps exceed 23 MB/s, we need to put this rate into perspective. Guccifer 2 claims he is a Romanian; some have claimed Russian; some have claimed neither, or even that Guccifer 2 may in fact be several people. Putting that controversy aside, ThreatConnect determined that Guccifer 2 likely used a commercial VPN service originating in France. If we accept the theory that Guccifer 2 is working out of Eastern Europe (or Russia), using a commercial VPN service as a relay to Washington, DC then I think it is fair to claim that the rate achieved will be nowhere close to 23 MB/s.

      The key point of the 23 MB/s rate is that it provides support for the conclusion that a local copy was made; that rate happens to also be consistent with a local copy to a USB-2 flash drive. Combine this with the observations that the copy was likely done on the East Coast and that ‘cp’ (inherently a local copy operation) was probably used, which would produce the observed last modified time pattern. Those related observations lead directly to the conclusion that the initial copy operation was likely a local copy.

      Other observations strongly argue against Guccifer 2’s claim that he hacked the DNC — the analysis noted that a second copy operation was done on Nov 1, 2016 which built the precursors of the final 7zip. Key conclusions: (1) this second copy operation was also likely done on the East Coast and (2) those precursors (the regular files and .rar file present in the 7zip file) were likely copied to a thumb drive. It would be difficult for a hacker in Eastern Europe (or Russia) to arrange for a thumb drive to be plugged into a system on the East Coast, and we would have to ask how is this consistent with Guccifer 2’s claim that he hacked into the DNC?


  6. If the metadata gives us the exact time and date that the files were downloaded locally.. why doesn’t someone find a timetable for who entered and exited the DNC around that time. Who was in the building? Was Seth Rich?


  7. Excellent presentation for the basics. The band-width for HDisks is actually for the bus; performance of the disk itself, well, depends, but rarely comes even close to the theoretical throughput. In terms of the network throughput, there is packet overhead on the data level(s), on the network level (routing and reassembly of packets), and at the transmission level (acknowledge, re-order, status, encryption, whatever). Real world performance will generally be lower than your examples.

