24 thoughts on “Guccifer 2.0 NGP/VAN Metadata Analysis”

  1. Let me play devil’s advocate for a moment. How do we know that the 9-1-2016 6:45PM copy was when the files were copied off the server? The files could have been exfiltrated some time prior and copied within the attacker’s system using the cp command at that moment. This may have been only the last of several cp copies. And these hacking groups have been known to adopt sleep schedules to match their target’s timezone. It’s not inconceivable that hackers in Russia would have their computers set to US Eastern time.

    Where we need to go from here is to examine the system logs on the server and look at the shutdown and startup times. If we find that the Windows server (I assume it was Windows, they’re Democrats) was shut down just before the start of the copy and came back online shortly after it finished (that is, if the logs show an unusually long reboot of highly coincidental timing), then we can be very confident that it was an inside job. This would also require that the server has a USB3 port to connect a suitably fast flash drive. But if the logs show that the server was running smoothly right through that time period, then it would not contradict the Russian-hacker theory. A Linux server running smoothly at the time could support either theory.
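
    If it turns out the box ran Linux, the check is quick; a sketch along these lines would do it (the time window below is only a placeholder for whatever period is under scrutiny):

    $ last -x reboot shutdown | head                                      # recent reboot/shutdown history
    $ journalctl --since "2016-09-01 18:00" --until "2016-09-01 19:30"    # system messages around the window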

    Crowdstrike presumably still has the hard drive images. And they claim to have sent copies to the FBI. Either could quickly check the logs and settle the question in five minutes.

    1. Well, Crowdstrike and the FBI have already examined all of this and released their findings months ago. We are unlikely to hear more from them barring any publicly disclosed information resulting from the Mueller investigation. I wouldn’t hold my breath for that. Also, it is doubtful that CrowdStrike still has the images. The FBI certainly does, but once the investigation was concluded, CrowdStrike was likely required to destroy them (standard practice).

      1. “Well, Crowdstrike and the FBI have already examined all of this and released their findings months ago.”

        Per Comey’s testimony, as I understand it, he said that the DNC denied access to their servers, even after being asked repeatedly (“at multiple times and several levels”) by the FBI. Comey also stated that they (the FBI) depended upon Crowdstrike for analysis of the servers and the (alleged) hacks. Crowdstrike declined an invitation to appear at a Congressional hearing subsequent to Comey’s testimony. If you have a different understanding please follow up, ideally with cites.

        “Also, it is doubtful that CrowdStrike still has the images.”

        Update: Previously, I said: “I have not seen/heard any statements/testimony by Crowdstrike that they made images […]”. Recently, Alec Dacyczyn followed up with a cite to a July 5, 2017 WT article,
        http://www.washingtontimes.com/news/2017/jul/5/dnc-email-server-most-wanted-evidence-for-russia-i/
        which states,
        “In May 2016 CrowdStrike was brought to investigate the DNC network for signs of compromise, and under their direction we fully cooperated with every U.S. government request,” a spokesman wrote. The cooperation included the “providing of the forensic images of the DNC systems to the FBI, along with our investigation report and findings. Those agencies reviewed and subsequently independently validated our analysis.”

        My main questions would run along these lines: (1) which systems were imaged, (2) were they full images or excerpts (such as providing only the artifacts that CS found of interest), (3) when were they made, and would the time interval include both claimed Russian hacks, and (4) who were the other agencies?

        This news that images were provided to the FBI and perhaps other agencies is coming late in the game, and was somehow omitted by Comey and others in testimony. Further, an email to the WT is a somewhat surprising (and weak) method of making this information public.

        Note: the term “images” usually refers to bit-for-bit (a literal copy of all blocks inclusive of deleted file data) copies of the drives in question. Here, “drives” might mean hard drives, SSD drives, USB drives, DVD’s, CD’s, floppies — basically any electronic media that can be imaged.
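
        For concreteness, a bare-bones way to take such an image from a Unix/Linux workstation might look like the following (the device and output names are placeholders, not anything known about the DNC systems):

        $ sudo dd if=/dev/sdb of=evidence-drive.img bs=4M conv=noerror,sync status=progress
        $ sha256sum /dev/sdb evidence-drive.img     # hash both to document that the image matches the source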

        “The FBI certainly does,”

        If they have, I haven’t seen/heard where the FBI said that. Please share a cite, if you have it.

        “but once the investigation was concluded, CrowdStrike was likely required to destroy them (standard practice).”

        In my experience, the DNC might have the court/tribunal direct the FBI to destroy their copies (if they have them) after the trial/investigation has been concluded but not before. For Crowdstrike, it is the DNC’s decision — it is the DNC’s data. Good practice might be to hang onto it for 3 or so years, just in case something else comes up. At their choosing, they could decide to do something like destroy all laptop images, or retain only logs, hacking artifacts and so on – their choice.

        It isn’t clear that the FBI performed their own independent investigation. Instead, the FBI decided to “check in” with Crowdstrike and then decided that no further action was needed, or so it seems.

    2. “How do we know that the 9-1-2016 6:45PM copy”

      The first copy was on 7-5-2016 at 6:45 PM. The second copy was on 9-1-2016.

      A way to look at this report is that it asks the question does the available data support the scenarios/conclusions claimed? It is not that other scenarios aren’t possible, and readers are welcome to state their opposing theories here. I may not challenge them point-for-point though, because we would just be arguing one speculation against the other and one person’s experience against the other. Ultimately, the readers/reviewers can decide for themselves whether the conclusions in this report seem plausible.

      “Where we need to go from here is to examine the system logs on the server and look at the shutdown and startup times. If we find that the Windows server (I assume it was Windows, they’re Democrats) was shutdown just before the start of the copy and came back online ”

      To date, to the best of my knowledge (correct me if I am wrong): Neither the DNC, the FBI, nor any other source that might be in a position to know has acknowledged Guccifer 2 or a hack that might be attributed to Guccifer 2, nor have they confirmed/denied that the data/docs released by Guccifer 2 originated in the DNC or a related organization. The NGP/VAN company denies that the “0 day” vulnerability claimed by Guccifer 2 exists.

      On the face of it, only Guccifer 2 claims that he successfully hacked the DNC.

      If you refer to the material at http://g-2.space and elsewhere, you will see reports that Crowdstrike was on site as early as late April 2016; per CS’s own reports, they “mitigated” the alleged hack(s) by re-installing software on all systems, inclusive of each individual’s laptop. CS does not say if they made image copies of hard drives, preserved logs or backups, and so on. If such actions were *not* taken, no one will be able to access the relevant logs and other relevant files now.

      When you say “on the server”, you seem to be suggesting that this analysis presupposes that a server might have been rebooted and a USB drive plugged into the server? That may be the case, but taking a server offline is fairly disruptive and might be noticed. Besides, these days services are often run on VM’s on a server, and taking down one physical server may take down many business processes and *that* will probably get someone’s attention.

      Instead, I contemplated rebooting an employee’s desktop PC. Here, two scenarios are considered: 1. An employee’s desktop is rebooted and files are copied over the LAN, or 2. the data is copied directly from the employee’s desktop PC’s hard drive.

      Alternatively, a laptop is brought in by the individual performing the collection; it may have Linux installed on it already, or a Linux USB drive is plugged into it and the laptop is rebooted into Linux. This latter idea has some appeal because you don’t have to commandeer someone’s desktop computer. On the other hand, as some have suggested, if the content of the “NGP VAN” 7zip has little to do with “NGP/VAN” (apart from a few spreadsheets and reports here and there) and looks more like the dump of some Dem worker’s work product (a Documents directory), then the collection can be made by going into that person’s office/cube, rebooting their desktop PC, and copying off the data. No servers required, no logs made, no authentication needed. After hours, the day after a 3-day July 4 weekend might be a good time to do that.
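
      For illustration, once such a machine is booted from a Linux USB drive, the collection step itself might look something like this (device names, mount points, and the user directory are assumptions, not findings):

      $ sudo mount -o ro /dev/sda2 /mnt/pc                        # the desktop's NTFS volume, mounted read-only
      $ sudo mount /dev/sdb1 /mnt/usb                             # the collection USB drive
      $ cp -r /mnt/pc/Users/someuser/Documents /mnt/usb/NGP-VAN   # plain cp stamps each file with the time of the copy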

      You state: “Crowdstrike presumably still has the hard drive images. And they claim to have sent copies to the FBI”. I missed that Crowdstrike claim. Can you provide a cite? It doesn’t square with Comey’s testimony that the FBI was denied access to the DNC servers.

      It is possible that the DNC might have had servers or VM’s running Linux. Linux-based systems might even run the NGP-VAN software for all I know. They might serve up users’ mail and their shared home directories. If a Linux-based server was accessed, it would certainly be easy to find the Unix ‘cp’ command on that system — and a USB device can be plugged directly into that system without a reboot (there would probably be a log entry, though, if anyone cared to check it).
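
      For what it’s worth, checking for that kind of entry is straightforward on Linux; something like this would do (log file locations vary by distro):

      $ dmesg | grep -i usb                                                                  # recent kernel messages about USB devices
      $ grep -i 'new high-speed usb device' /var/log/kern.log /var/log/messages 2>/dev/null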

      1. Yep, you’re right. I mixed up the July and September dates in my post.

        I KNEW I remembered Crowdstrike saying something about images. But it took me a while to sift through the junk-news to find it again. Here, half way down:
        http://www.washingtontimes.com/news/2017/jul/5/dnc-email-server-most-wanted-evidence-for-russia-i/

        “In May 2016 CrowdStrike was brought to investigate the DNC network for signs of compromise, and under their direction we fully cooperated with every U.S. government request,” a spokesman wrote. The cooperation included the “providing of the forensic images of the DNC systems to the FBI, along with our investigation report and findings. Those agencies reviewed and subsequently independently validated our analysis.”

        I assume they were referring to full block-device-level hard drive images.

      2. Thanks, that is an interesting disclosure. For those who didn’t click through the Wash Times URL, that article was posted quite recently, on 7/5/2017. Yes, “image copies” is equivalent to “bit-for-bit” (or “block-by-block”) image copies. In the article, it was difficult to determine when exactly the images were made. Crowdstrike was on scene at the DNC as early as April 2016, per some reports. Anyway, it appears that the images were made well ahead of the 7/5/2016 date that the timestamps indicate that Guccifer 2 took the (so-called) NGP-VAN data.

  2. I’d encourage you to post a link to this comment area at the top of the main post (perhaps in that Acknowledgements area) so that readers can easily see dissenting viewpoints. As for my feedback on your analysis:

    1) I would suggest removing the leading commentary from your conclusions. Statements like “The data was likely initially copied to a computer running Linux” are misleading – that is only one possibility and in my professional opinion, not the most likely. I’d suggest changing statements like this to “The data may have initially been copied…”

    2) I know you somewhat address this in a follow-up post, but 23MB/s is not anywhere near out of the realm of possibility for remote file transfers, especially not for large organizations or government agencies. My clients with international connections easily reach these speeds, so they are not *necessarily* indicative of a local transfer.

    3) You mention that the files were copied individually, not as a single large package. This can actually help speed up remote transfers, as multiple files can be sent in parallel, bypassing a lot of the bottlenecking you can experience in international peering.

    4) All of the above is somewhat of a non-issue in my experience. It would actually be relatively uncommon for individual files to be exfiltrated in this manner. *Far* more common would be for them to be collected on a local machine under remote control, packaged nicely, then exfiltrated as a single package. Depending on the level of security, this can be accomplished in a single big transfer, or the package can be fragmented to speed up the transfer.

    5) If the files were collected locally before being extracted, this would easily explain the EDT times, the FAT timestamps, and the NTFS timestamps. None of this indicates one way or the other whether the attacker was local or remote. It is impossible to tell from any of this evidence, and suggesting otherwise is disingenuous.

    6) The conclusion that this also involved a USB drive and a Linux OS is also likely flawed. As you point out, ‘cp -r’ is an easy explanation, but booting to Linux is not the only way to accomplish this type of transfer. Many remote access tools use ‘cp’ and ‘scp’ as the base for their file copy tasks. This would leave the timestamps in exactly the format you describe. In my experience, it is *very* common to see this sort of timestamp in a breach investigation.

    7) The scenario you envision, frankly, is overly complex and unlikely. It is, in my opinion, far more likely that a remote attacker utilized a single breached DNC machine to locate and collect the desired data, did so using their attack tool (rather than RDP and drag+drop), and packaged it all for exfiltration on that machine. This would be supported by all of the evidence you describe and matches the most common breach scenarios we’ve seen over and over again.

    Overall, I think your investigation of the data is good. You pull out some interesting information and were thorough in your research. However, your analysis seems tainted by the intent to draw specific conclusions from this data. Looked at objectively, the most likely scenario supported by your data is not the one you propose. This article could be rewritten to be very informative without the obvious slant and doing so could make it a valuable resource for those interested in the information. As it stands now, however, the bias in your conclusions makes the analysis difficult to take at face value, because the reader is left having to separate technical evidence from personal bias.

    I hope you’ll consider re-writing (or at least amending where the evidence supports other potentially more likely possibilities) because this is certainly research that is worth a read. If you can separate your personal feelings from the technical analysis and conclusions, this would be worth submitting to a journal for peer review, rather than leaving it sitting on an anonymous blog.

    I hope this feedback was useful, if for no other reason than to present a different viewpoint.

    1. “3) You mention that the files were copied individually, not as a single large package. This can actually help speed up remote transfers, as multiple files can be sent in parallel, bypassing a lot of the bottlenecking you can experience in international peering.”

      The saying goes: “In theory, the difference between theory and practice is small. In practice, the difference between theory and practice is large.”

      The problem is that ‘cp’ and its close cousin ‘scp’ are simple, non-threaded programs. They are *not* Robocopy or FileZilla, and if they were, they would preserve the last mod times.
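
      A quick way to see the difference for yourself (the file names are just for the demo):

      $ touch -d '2016-07-05 18:39' original.xlsx
      $ cp original.xlsx plain_copy.xlsx              # plain cp: destination gets the time of the copy
      $ cp -p original.xlsx preserved_copy.xlsx       # -p: destination keeps the source's last mod time
      $ ls -l --time-style=full-iso original.xlsx plain_copy.xlsx preserved_copy.xlsx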

      I encourage you to run a few experiments and get back to us with both positive and negative results.

      “4) All of the above is somewhat of a non-issue in my experience. It would actually be relatively uncommon for individual files to be exfiltrated in this manner. *Far* more common would be for them to be collected on a local machine under remote control, packaged nicely, then exfiltrated as a single package. Depending on the level of security, this can be accomplished in a single big transfer, or the package can be fragmented to speed up the transfer. ”

      Far more common, in my experience, is for the files to be copied over the wire and not deposited in a local directory first. A local directory leaves a footprint. A 20G directory leaves a *big* footprint. That, and it is an unneeded extra step to make a local copy of the data.

      Something like this on Unix:

      $ tar cfz - /path/to/NGP-VAN | ssh BASE1 'tar xfz -'    # /path/to/NGP-VAN stands for wherever the server share is reachable

      There’s a lot of ways to do that; the command is intended as an example that the files can be streamed over the ‘net without the need to make a local copy of NGP-VAN.

      Below is something like what you’re describing:

      $ cp -r /path/to/NGP-VAN .
      $ zip -r NGP-VAN.zip NGP-VAN
      $ rm -rf NGP-VAN

      and then transfer NGP-VAN.zip back to Romania. When the zip file is ultimately unpacked, it will reproduce the ‘cp’-style last mod pattern in NGP-VAN. Again, why make the local copy at all?

    2. “5) If the files were collected locally before being extracted, this would easily explain the EDT times, the FAT timestamps, and the NTFS timestamps. None of this indicates one way or the other whether the attacker was local or remote. It is impossible to tell from any of this evidence, and suggesting otherwise is disingenuous.”

      Before I answer, please clarify/restate: “If the files were collected locally before being extracted, this would easily explain the EDT times, the FAT timestamps, and the NTFS timestamps.” Outline your proposed scenario in enough detail that we can follow it, and comment on it. Explain how that scenario supports your claims.

      The analysis doesn’t say “with 100% certainty the attacker was not remote”. It says that the fact pattern indicates a local copy was made and that the file times in that local copy show the pattern of using ‘cp’, which is primarily used for local copying operations. It further states that the effective transfer rate of 23 MB/s is too fast to support the idea of file-by-file copying back out over the Internet (although that would be an unusual way to use ‘cp’, it does allow for the use of ‘scp’).

      Readers can decide, or opine, on whether they think it makes sense that a hacker would first make a local copy of the files before shipping them offsite, which creates a big intermediate directory and will add more time to the overall operation. *That* does, IMO, seem to me like an extra step added to fit the facts.

      A big hurdle for anyone claiming Guccifer 2 hacked the DNC (either in the way he claimed or otherwise) is to explain why neither the DNC, the FBI, Crowdstrike, nor NGP-VAN supports the claim that Guccifer 2 hacked the DNC. In fact, the DNC hasn’t acknowledged that the files in the disclosed NGP-VAN .7z file are the DNC’s files. That is one pretty strong reason to come into the analysis with a “not a hack” bias.

      “6) The conclusion that this also involved a USB drive and a Linux OS is also likely flawed. As you point out, ‘cp -r’ is an easy explanation, but booting to Linux is not the only way to accomplish this type of transfer. Many remote access tools use ‘cp’ and ‘scp’ as the base for their file copy tasks. This would leave the timestamps in exactly the format you describe. In my experience, it is *very* common to see this sort of timestamp in a breach investigation. ”

      On this point, “Many remote access tools use ‘cp’ and ‘scp’ as the base for their file copy tasks.” If the host runs Linux/UNIX, I can accept that statement, because UNIX has those commands already installed. I can’t see why they’d bother shipping in ‘cp’ because Windows has “COPY” already. ‘scp’ maybe, but I’d like to hear that you/others have either seen this in practice or can point to a document that supports that statement.

      When you say ” it is *very* common to see this sort of timestamp in a breach investigation. “. Was that a breach of a Windows based system? Did you also see the hackers making a large local copy of a (20G) directory before shipping it out?

    3. “7) The scenario you envision, frankly, is overly complex and unlikely. It is, in my opinion, far more likely that a remote attacker utilized a single breached DNC machine to locate and collect the desired data, did so using their attack tool (rather than RDP and drag+drop), and packaged it all for exfiltration on that machine. This would be supported by all of the evidence you describe and matches the most common breach scenarios we’ve seen over and over again.”

      Complex (and simple) are always in the eye of the beholder. Rather than debating the vague quality of complexity, let’s clearly state our cases and let others decide on which of the two interpretations of the facts matches up with their experience and their sense of what makes sense to them.

      Here is what I see as a simple scenario, in the paragraphs below.

      First, we assume that this was not a hack. We come in with that bias because no one who should know is saying G2 hacked DNC. Maybe they have their reasons (ongoing investigation, etc).

      Our bias won’t matter anyway, if the facts don’t support it.

      We note that fast transfer times support the idea of a local copy. We discard the idea of making a temp copy locally, because it seems unnecessary (more complex) and in my experience hackers work hard *not* to leave big footprints. 20G (or even 2G) is a big footprint.

      We note last mod time patterns that are consistent with the use of the ‘cp’ command, which is a Unix command. Linux is Unix. Bootable Linux drive images are widely available; they are easily burned to a USB drive. They are commonly used by IT admins, pen testers, forensics types, and hackers (there, I said it).
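
      Burning one is a one-liner; for example (the ISO name is a placeholder, and /dev/sdX stands for the USB stick, which gets overwritten):

      $ sudo dd if=linux-live.iso of=/dev/sdX bs=4M status=progress && sync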

      So, we think: let’s look at “boot Linux from a USB drive”. Is that simple? Before answering, let’s decide whether phishing, hacking a firewall, or escalating privileges sufficiently to access someone’s Documents directory or some network file share is “simple”. I’ll say “no”.

      To me the idea of an insider going to an employee’s desktop PC, on the day after a 3 day July 4 weekend, after hours, booting a Linux USB drive and then taking 15 minutes to copy off a big directory/two is simple. No hack, no authentication, no logs. Alternatively, you might access a network share. For that though, you’ll probably need authentication. As an insider you can side step that, esp if you have some sort of network admin privileges. Maybe you’ll leave some log entries behind you, but with a 60 day retention policy there won’t be any when you release the docs 2.25 months later.

      In passing, did it occur to anyone at the DNC, that they should download the NGP-VAN 7zip file produced by Guccifer 2 and take a look? Those 7/5/2016 dates are pretty obvious. Would that prompt them to check their logs? Would it prompt them to track down the locations where the data in that .7zip file can be found?

      Note: I’m not saying this is what happened, just that the facts both support the scenario and don’t negate it. If we saw a 2 MB/s transfer rate, I would back off the idea of local copy. If we saw a 200 MB/s transfer rate, I’d say that there is something wrong with the metadata.

      “I hope this feedback was useful, if for no other reason than to present a different viewpoint.”

      Yes, thanks for taking the time to provide detailed counter-points and for encouraging discussion.

      1. “To me the idea of an insider going to an employee’s desktop PC, on the day after a 3 day July 4 weekend, after hours, booting a Linux USB drive and then taking 15 minutes to copy off a big directory/two is simple.”

        Let me clarify here, before someone starts warming up the phasers. Only Guccifer 2 has stated that the files disclosed in the NGP-VAN 7zip are from the DNC and were somehow obtained as the result of exploiting vulnerabilities in NGP/VAN or the DNC firewall. Please read the statement above as hypothetical. We don’t know; it might be from some DNC ex-employee’s backup drive inadvertently left on the counter of the Starbucks across the street from the DNC. We don’t even know whether the data can be authenticated as coming from the DNC.

        The hypothetical above is based on the same premise as the “remote hack” theory — someone named Guccifer 2 collected DNC data, presumably from behind the DNC firewall, and this data was later disclosed on Sept 13, 2016.

      2. “To me the idea of an insider going to an employee’s desktop PC, on the day after a 3 day July 4 weekend, after hours, booting a Linux USB drive and then taking 15 minutes to copy off a big directory/two is simple. No hack, no authentication, no logs. Alternatively, you might access a network share. For that though, you’ll probably need authentication. As an insider you can side step that, esp if you have some sort of network admin privileges.”

        The act itself sounds simple but fitting it into context could generate a great deal more complexity. How many DNC employees would you say there were who could have conceivably accomplished this? What are some likely motives for carrying out this act and are any of them consistent with the data?

        I’ve seen many people suggest some manner of whistleblower scenario, usually relating to favoritism the DNC showed to Hillary Clinton over Bernie Sanders. That a DNC employee would randomly decide to steal a great deal of data on the off chance of finding something incriminating that they could leak, just for the sake of becoming a whistleblower, sounds to me like a rather far-fetched scenario. Our hypothetical whistleblower would more likely have been privy to DNC malfeasance prior to accessing the server, and then later downloaded the data in order to obtain evidence of that malfeasance, along with other useful information that may have been in the same directory. This in turn narrows down the suspect pool. We need a person who might have become privy to the emails concerning undermining the Sanders campaign, who was not sympathetic to this end, and who had the kind of access needed to steal from the server.

        I can think of two other scenarios. Someone actively infiltrating the DNC by becoming an employee for the purpose of stealing information on behalf of some third party or a previously loyal DNC employee flipped by an outside motivation. Both are complicated and push the boundaries of plausibility especially since there is no evidence to support either.

        Then there’s further complexity generated by fitting it into the greater context of the investigation, as well as national and global politics.

        If there was no hack, would this not imply that Crowdstrike was lying to the FBI? What would compel a cybersecurity firm to commit such a felony? How does one even go about contracting a firm for that purpose?

        This is to say nothing of the backdrop of Russian funded electioneering campaigns and other documented hacking attempts of Government and Near Government organizations.

        You go on to suggest that the data may not have even come from the DNC which adds further layers of complication. You said: “We don’t know; it might be from some DNC ex-employee’s backup drive inadvertently left on the counter of the Starbucks across the street from the DNC. We don’t even know whether the data can be authenticated as coming from the DNC.” I assume this particular scenario was in jest but it raises the question what’s a plausible alternative scenario?

        So what at first appears physically and technically simple in context becomes logistically complex.

      3. You make some good points and thanks for your reasoned response. The purpose of the study is to analyze the available metadata, make observations, and to some extent speculate based on those observations. The point of the speculations is to illustrate whether the analysis supports or disputes the claim that Guccifer 2 hacked the DNC and then published the “NGP VAN” 7zip file. In that context, we have to assume that (1) the 7zip file represents data derived from a DNC source, and (2) there may have been a hack or leak.

        We make those assumptions because we are trying to test Guccifer 2’s claim that he hacked the DNC, then obtained the “NGP VAN” data and later disclosed it. So yes, the suggestion that this data might have been derived from an ex-employee’s backup thumb drive was partly in jest, but also to remind us all that we don’t know the actual source of the data. For the purpose of this study we need to follow Guccifer 2’s claims, because we are testing the veracity of those claims.

        The study doesn’t try to speculate on whether there was a whistle blower, an insider, or even an agent of some state government. It simply disputes the scenario claimed by Guccifer 2 that there was a hack initiated from Eastern Europe or Russia. The point of describing a scenario involving a boot to USB with Linux was mainly to illustrate a feasible and reasonable scenario that fits the facts.

        “If there was no hack, would this not imply that Crowdstrike was lying to the FBI?”

        As far as Crowdstrike goes, from what I recall Crowdstrike stated that they found indicia of malware which they attributed to two alleged Russian-sponsored hacking groups (COZY BEAR and FANCY BEAR). They were not able to determine if any information had been exfiltrated. IIRC, Crowdstrike never made any claims re: Guccifer 2 and therefore did not link Guccifer 2 to the alleged Russian hacks of the DNC. If you/others have information to the contrary, please post a reply.

        Thus, based on the public record, there is no information that shows that Crowdstrike lied to anyone. Their findings were sufficiently limited that people question whether they fully support the conclusions that Russian-sponsored groups hacked the DNC and later leaked DNC documents and emails to Wikileaks — all in an effort to influence the election in favor of a Trump victory.

  3. “I’d encourage you to post a link to this comment area at the top of the main post (perhaps in that Acknowledgements area) so that readers can easily see dissenting viewpoints.”

    Good idea. It may take a day or two, but I’ll update the article per your suggestion.

    “1) I would suggest removing the leading commentary from your conclusions. […]”

    Thanks for the suggestion, but the document will stay as is unless major technical issues are found or clarifications are needed. After each conclusion, there is a statement regarding the basis for the conclusion. Hopefully, that helps.

    “2) […] 23MB/s is not anywhere near out of the realm of possibility for remote file transfers, especially not for large organizations or government agencies. […]”

    The analysis report notes that if you take the last mod time stamps of all the constituent files (after unpacking the top .rar files), they’re all compressed into a 14-minute period with significant gaps amounting to 13 minutes. The analysis follows the theory that the 13 minutes of gaps represent files that were copied, but left out of the final .7zip file. If we look at only the files copied and use their total size (in bytes) divided by (elapsed time minus total gaps), we get a transfer rate of 23 Mbytes/sec. Because of the pattern of the last mod dates, we conclude that a command like Unix’s ‘cp’ was used.
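
    The arithmetic behind that figure is simple; with stand-in numbers (the byte total below is a placeholder chosen to illustrate the method, not the measured size of the archive):

    $ awk 'BEGIN { total_bytes = 1.4e9; span_s = 14*60; gap_s = 13*60; printf "%.1f MB/s\n", total_bytes / (span_s - gap_s) / 1e6 }'    # prints 23.3 MB/s with these stand-ins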

    Some people have suggested that the first copy operation might have been out to a location close to the DNC, and that those files were then copied from there. Let’s call that location BASE1.

    Let’s first note that the usual use of ‘cp’ is a local copy operation. There is another form of ‘cp’ called ‘scp’ (secure copy); it works pretty much like ‘cp’ but can go remote. It will require some setup on BASE1, but it would be the natural way to create the ‘cp’ pattern of last mod times while copying over the net. It might look like this:

    scp -q -r 'NGP-VAN' BASE1:

    (above: “-q” for ‘quiet’, “-r” for ‘recursive’)

    In practice, ‘BASE1’ might be the IP address of our clandestine server.

    Nothing wrong with that — it fits the facts: a last mod pattern with the appearance of a file-by-file copy. ‘scp’ always encrypts the data, so the content of your packets will be difficult to sniff on the wire. If you also enable compression (the -C option), you may get a tail wind: the data is compressed before it is encrypted, and since encryption can be a CPU-intensive, slow operation, performing it on less data can make things go faster, as long as your compression algorithm runs faster than your encryption algorithm.

    Now the only thing you’ll need to do is to time it. What you’ll find is that file-by-file copying will slow things down a lot. How much is a lot? Some testing is needed, but 3x to 10x worse is possible. File-by-file copying introduces per-file and per-directory creation overheads that have nothing to do with the raw link speed, and the back-and-forth handshake for each file adds its own latency on top.

    The bottom line is that just because you have a fast link, you may not come close to hitting its peak transfer rates, because there are other overheads involved. If I have some time, I will try to back up those claims. Otherwise, I encourage you to try a few experiments, ideally using the actual NGP VAN 7zip data. Try it on your local net first.
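
    If anyone does want to run that experiment, a rough sketch would be to time a file-by-file copy against a single streamed transfer of the same tree over the same link (BASE1 and the paths are placeholders):

    $ ssh BASE1 'mkdir -p /tmp/tar-test'
    $ time scp -q -r NGP-VAN BASE1:/tmp/scp-test                        # file-by-file, with per-file round trips
    $ time tar cf - NGP-VAN | ssh BASE1 'tar xf - -C /tmp/tar-test'     # one continuous stream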

  4. Have not read all of your article, but I would not use “touch” on any of the files, as you are modifying the timestamps (I know you know that…)! That is, you are corrupting your own data. If the goal is to adjust to a different timezone, then change your TZ env var. If you are using Cygwin, it has a package with all the timezone data files in the world; it takes a while to install, but I always installed it. Cygwin is great, but thankfully I have not needed to use it since I live in Fedora or CentOS now.

    Thanks for your good work!

    1. The problem is that you have two sets of files: (1) those that come from the .7zip file and (2) those contained in the .rar files. For example, CIR.zip is in the 7zip file and has a last mod date of 7/5/2016 3:52:00 PM (when opened in the Pacific Time Zone). This needs to be advanced by 3 hours to agree with the times in the .rar files (which show local time).

      Generally, it is not a good idea to change metadata when reviewing evidence, but in this case you have files with two differing time representations (UTC for the 7zip, local for the .rar files). We need to adjust the .7zip files to bring them into agreement, as they would appear when the files were originally copied.

      If that doesn’t quite make sense, consider that if the .7zip file were opened on the East Coast that the last mod date/times would fall into the same range as those shown in the .rar files. That’s because the 7zip GUI will adjust the UTC times to the equivalent time in the time zone in force when you open the 7zip file. As an aside, most Windows based programs don’t act on the TZ setting.
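
      As a footnote for the Unix-minded: the same re-expression can be done without modifying anything, by simply viewing the extracted files under an Eastern time zone setting (the directory name is a placeholder for wherever the 7zip was unpacked):

      $ TZ=America/New_York ls -lR --time-style=full-iso ngp-van-extracted/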
