Myce.com Latest Updates

Microsoft OneDrive for Business modifies files as it syncs

Posted at 17 April 2014 22:22 CET by Seán Byrne

While we often hear about privacy concerns with storing data in the cloud, such as on Dropbox, one thing we take for granted is data integrity: files are not altered in any way on the cloud unless the user actually modifies them online.  For example, if a user syncs a spreadsheet file with Google Docs, the file stored on Google Drive should be an exact byte-for-byte match with the original file until the user modifies either the cloud file in Google Docs or the locally stored file in a spreadsheet application.  In fact, many consumers go as far as trusting the cloud as their only backup.

Microsoft OneDrive for Business (formerly SkyDrive Pro) is Microsoft’s workplace equivalent of OneDrive and comes bundled with most Office 365 subscriptions.  It is designed to give the business control over the employee’s data stored within the synced folders.  However, unlike the consumer version of OneDrive, we found out by accident that what gets synced to the cloud is generally not the same as what gets synced back from the cloud, even when no one has touched the files online or elsewhere.

OneDrive got stuck in an endless loop of trying to sync a few files, and the issue returned after I cleared its cache as instructed on Microsoft’s discussion forum, so I decided to stop syncing the OneDrive folder and backed it up.  I then deleted the original synced folder and got OneDrive to start syncing it again, so it would get a fresh copy from the cloud.  To check whether any files had been damaged by the earlier syncing issue, I used a utility called MD5summer to create MD5 hashes for the backup’s content and repeated this process for the freshly synced folder.  To my surprise, the vast majority of the files showed ‘Checksum did not match’.  Surely most of my files hadn’t gone corrupt?
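For readers who want to repeat this kind of check without MD5summer, the comparison can be roughly approximated with a short Python script. This is only a sketch of the general idea (hash every file in two folders, then compare the results), not a description of how MD5summer itself works; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its MD5 digest."""
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as changed, missing or added between two manifests."""
    return {
        "changed": sorted(k for k in before if k in after and before[k] != after[k]),
        "missing": sorted(k for k in before if k not in after),
        "added": sorted(k for k in after if k not in before),
    }
```

Calling `compare_manifests(build_manifest(backup_dir), build_manifest(synced_dir))` with the backup folder and the freshly synced folder would list every file whose checksum no longer matches.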

I then started opening various files that failed the MD5 check, but could not find any obvious damage to any file.  That was until I noticed several PHP files from a website theme that also failed the MD5 check.  When I compared them side by side in Notepad++, I noticed straight away a few pieces of code injected into the header that clearly could not have been caused by any form of data corruption.  I knew for sure that neither I nor anyone else would have made these changes as the theme files were from a former website CMS package, so I then tried finding out what was modifying these files.

To check if OneDrive for Business was the culprit, I created a handful of mostly empty files of different types I frequently use and handwrote a simple PHP file and HTML file in Notepad++, so any modifications would clearly stand out.  I then used MD5summer to create MD5 hashes and then placed these files in a folder for OneDrive for Business to sync.  A few hours later, I booted my laptop which also has OneDrive for Business installed and a moment later, this folder appeared.  I then ran MD5summer and this is what I got:

[Screenshot: OneDrive for Business MD5 tests]

The following highlighted in red is what OneDrive for Business injected into the HTML file:

[Screenshot: OneDrive for Business HTML file modification]

While ‘uuid’ stands for universally unique identifier, the code “C2F41010-65B3-11d1-A29F-00AA00C14882” is the same in every PHP and HTML file OneDrive for Business modified, including files belonging to other users.  Even though this modification does not make the file traceable, it is obviously going to be a nuisance for web developers who use OneDrive for Business to sync web files with each other, especially handwritten files where they don’t expect extra code to be added.
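Because that identifier is the same in every affected web file, one simple way to find out which files in a folder have been touched is to search for it. The sketch below does only that: it looks for the UUID string quoted above and makes no assumption about the exact markup wrapped around it. The list of file extensions and the function name are illustrative choices.

```python
from pathlib import Path

# UUID quoted in the article; it was identical in every modified web file.
MARKER = "C2F41010-65B3-11d1-A29F-00AA00C14882"
# Assumption: only these extensions are scanned; extend as needed.
WEB_SUFFIXES = {".php", ".html", ".htm"}


def files_containing_marker(root: Path, marker: str = MARKER) -> list[str]:
    """Return relative paths of web files whose text contains the marker."""
    needle = marker.lower()
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in WEB_SUFFIXES:
            # Ignore undecodable bytes; we only need a substring match.
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.lower():
                hits.append(path.relative_to(root).as_posix())
    return hits
```

The match is case-insensitive, since the identifier mixes upper- and lower-case hex digits.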

As for Word, Excel and Publisher files (‘docx’, ‘xlsx’ and ‘pub’ file extensions), these grew by about 8KB.  Unlike the web files, these Microsoft Office files had what appears to be uniquely identifiable code added, potentially making it possible to match them to a company and possibly even to a specific user’s account.  To get an idea of what was added, I used 7-Zip to extract the content of the Word file before and after syncing.  Two ‘.rels’ files and one XML file were modified, and three folders with files were added: ‘customXml’ containing six XML files, a folder ‘_rels’ inside this containing three ‘.rels’ files, and a ‘[trash]’ folder containing a ‘0000.dat’ file.  In the ‘docProps’ folder, a file ‘custom.xml’ contains a property with a ‘ContentTypeId’ name attribute with a unique ID.
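The same before-and-after comparison can be scripted for ‘docx’ and ‘xlsx’ files, since these are ZIP containers. The following is a minimal sketch using Python’s standard `zipfile` module rather than 7-Zip; the function names are hypothetical, and it will not work on files that are not ZIP archives.

```python
import hashlib
import zipfile


def member_digests(archive_path: str) -> dict[str, str]:
    """Map each file inside a ZIP container to the MD5 of its contents."""
    with zipfile.ZipFile(archive_path) as archive:
        return {
            info.filename: hashlib.md5(archive.read(info)).hexdigest()
            for info in archive.infolist()
            if not info.is_dir()
        }


def diff_archives(original: str, synced: str) -> dict[str, list[str]]:
    """List members added, removed or modified between two containers."""
    before = member_digests(original)
    after = member_digests(synced)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            name for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    }
```

Running `diff_archives` on the original and synced copies of a Word file should surface the kind of added folders and modified ‘.rels’ files described above.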

When I used 7-Zip to look inside the two Microsoft Publisher files, the synced Publisher file had a ‘MsoDataStore’ folder added to it, which contains three folders with gibberish names, each holding two XML files.  Inside, I found the same ContentTypeId code as in the Word file, and while the two matched each other, the code was different to that in the files of other users I compared with.


Even though OneDrive for Business modified these files, it left the ‘Date Modified’ attribute in every file unchanged, so to an unsuspecting user who just checks when the files were modified, they appear untouched.  For example, the Word file shows a modified time of ’16:14:14’ for both the original and synced file, even though the file sizes are clearly different.  The only files that remain untouched are those that were placed in the synced folder on the original computer, so even if a user checks the files they place in a synced folder, they would not know anything is being modified unless they physically took those files to another computer with the matching synced folder to compare them.
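This is why checking the timestamp alone is not enough. As a rough illustration, the sketch below flags a pair of files whose ‘Date Modified’ values match while their contents differ. The function name is hypothetical, and comparing timestamps to the nearest second is an assumption based on the ’16:14:14’ example above.

```python
import hashlib
from pathlib import Path


def silently_changed(original: Path, synced: Path) -> bool:
    """True when contents differ although the modification times match."""
    # Assumption: compare modification times at one-second resolution.
    same_mtime = int(original.stat().st_mtime) == int(synced.stat().st_mtime)
    same_bytes = (
        hashlib.md5(original.read_bytes()).digest()
        == hashlib.md5(synced.read_bytes()).digest()
    )
    return same_mtime and not same_bytes
```

The original copy has to come from a backup or from the computer where the file was first placed in the synced folder, as only that copy stays untouched.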

So what this means is that people who use OneDrive for Business or SharePoint need to be very careful with what they sync with it, especially those handling third party data due to confidentiality issues.  For example, if an employee needs to transfer confidential files that absolutely must not be touched between their laptop and PC and decides to do so through a synced folder in OneDrive for Business, those files will end up being modified without the user’s knowledge.  This could have severe consequences if, say, a file is used as evidence in a court case.  How do you prove that the company did not intentionally modify it?

Based on Myce testing, we found that the consumer version of OneDrive (formerly SkyDrive) does not appear to modify any files, whether synced with the desktop product or through the web interface.  We also tested BitTorrent Sync and found that it does not modify any files either, even when testing a 1GB folder with a wide range of file types.

Got any questions on cloud syncing, backup or storage?  Please discuss them in our File Sharing forum.

 


There are 27 comments

Achant
MyCE Junior Member
Posted on: 17 Apr 14 22:57
    Their media player (the last time I used it) also modified .mp3 files. It meant my files were showing as corrupt after playing them because I kept md5 info on them.
    I think it was updating, or clearing some **** data but couldn't see anything. I found nothing to prevent this, other than making the files read only before playing.
    Achant
    MyCE Junior Member
    Posted on: 17 Apr 14 23:02
      I wasn't swearing above about the data. Don't know why it put in the ****. The word, unless I typed it wrong was m.e.t.a. data
      Wombler
      Administrator & Reviewer
      Posted on: 17 Apr 14 23:09
        Great detective work there Seán with some scary implications particularly for privacy.

        Makes you wonder why Microsoft wants to make the documents traceable and identifiable.


        Wombler
        bean55
        Moderator
        Posted on: 17 Apr 14 23:14
          Mr Gates must have worked for Uncle Sam at one time
          ILLP
          MyCE Senior Member
          Posted on: 18 Apr 14 01:34
            Quote:
            Originally Posted by Wombler
            Great detective work there Seán with some scary implications particularly for privacy.

            Makes you wonder why Microsoft wants to make the documents traceable and identifiable.


            Wombler
            Privacy was one of the concerns when SkyDrive (now called OneDrive) was first released, and this article just supports what many of us thought in the beginning. MS can have their cloud storage.
            debro
            Blown to smitherines
            Posted on: 18 Apr 14 02:25
              While chances are that all these modifications are automatic, it demonstrates that Microsoft does have access to, and is modifying, these documents in the supposedly encrypted storage.

              How long will it be until someone is indicted for publishing something that they didn't write, but that was modified once uploaded into cloud storage?

              If cloud storage must be used, your data has to be encrypted by you, before it's uploaded to the storage provider.

              Massive fail for cloud providers in general, and Microsoft in particular.
              CDan
              MyCE Resident
              Posted on: 18 Apr 14 06:04
                I was just looking at Tresorit for secure cloud storage. Not that I ever considered a MS solution, but this makes it a lot less likely I will. Anybody have anything good or bad to say about Tresorit?
                Wombler
                Administrator & Reviewer
                Posted on: 18 Apr 14 14:18
                  Regardless of the reasons though I don't think I'd trust any organisation that adds identifying information to my own files.


                  Wombler
                  Ibex
                  CDFreaks Resident
                  Posted on: 18 Apr 14 16:38
                    Is anyone genuinely surprised that this was happening?

                    Would be bad enough if it was just their free service...

                    Looks like businesses will be forced to only use services which allow them to overlay their own encryption, with locally stored keys, and strong legal protection to guarantee that their data will never leave the EU.

                    And if your business is required by law to keep verifiable records...
                    You could end up in serious trouble.

                    Welcome to the world of everything-as-a-service computing. This is just the beginning.
                    Wombler
                    Administrator & Reviewer
                    Posted on: 22 Apr 14 14:51
                      I'm not remotely surprised but then it wouldn't surprise me either if it eventually turned out that the NSA had some involvement in this as well.

                      People's files shouldn't be modified by cloud services without some form of justifiable, and very visible, form of explanation IMO.


                      Wombler
                      RTV71
                      MyCE Member
                      Posted on: 24 Apr 14 05:32
                        I think OneDrive for Business is using SharePoint and the added UUID is SharePoint behavior for document tracking and control. They should add an option to turn it off.
                        choose-another
                        New Member
                        Posted on: 24 Apr 14 10:43
                          OneDrive for Business _is_ SharePoint - it's the new new name for SharePoint Workspace to be more precise. Syncing with a SharePoint server in the MS cloud. It is thus a completely different thing from OneDrive (consumer) and why MS marketing chose to confuse it this way I guess we'll never understand.

                          SharePoint is a DMS not a filesystem, and it syncs document management metadata from, and _to_, the documents (depending on file format).

                          In container file formats this is done in metadata, not content, sections - your document content is not touched (try office documents with digital signatures - the signature will remain valid, because it validates the content areas of the file format, not metadata). In some file formats it will add comments in a way that does not affect the content (as above).

                          That is all this is, SharePoint on-premise will do exactly the same thing, and it is a documented SharePoint feature for at least a decade - see e.g. http://weblogs.asp.net/bsimser/archive/2004/11/22/267846.aspx
                          Thue
                          New Member
                          Posted on: 24 Apr 14 10:57
                            > So what this means is that people who use OneDrive for Business or SharePoint need to be very careful with what they sync with it

                            Ok, this just seems absurd to me. People who use OneDrive need to run away screaming, not "be very careful". Really.
                            DoMiN8ToR
                            Management
                            Posted on: 24 Apr 14 12:49
                              Nevertheless, it would have been much more logical to put meta data in a separate file instead of touching the original files. It even renders .html and .php files unusable.
                              YouDontWantToKnow
                              New Member
                              Posted on: 28 Apr 14 16:47
                                It's been in SharePoint (the underpinnings of OneDrive for Business) for several releases. Read more about it here - http://msdn.microsoft.com/en-us/libr...ffice.14).aspx. The post is click bait.
                                DoMiN8ToR
                                Management
                                Posted on: 28 Apr 14 16:53
                                  Quote:
                                  Originally Posted by YouDontWantToKnow
                                  It's been in SharePoint (the underpinnings of OneDrive for Business) for several releases. Read more about it here - http://msdn.microsoft.com/en-us/libr...ffice.14).aspx. The post is click bait.
                                  It's certainly no clickbait. From your post you can at least conclude it's bad marketing from Microsoft as the new name certainly gives different expectations. Also adding meta data to a file that renders it unusable is not really a great 'feature'.
                                  roadworker
                                  MyCE Resident
                                  Posted on: 28 Apr 14 17:15
                                    Quote:
                                    Originally Posted by DoMiN8ToR
                                    Also adding meta data to a file that renders it unusable is not really a great 'feature'.

                                    Well said,DoMi!!!
                                    BigJobbies
                                    New Member
                                    Posted on: 23 May 14 15:35
                                      What I find particularly concerning is that it also modifies the contents of password-protected files.
                                      The fact that a password-protected ".xlsx" file is NOT a zip file suggests that perhaps the whole thing is encrypted, not just key files within the ZIP. If so (and I *do* hope I am wrong) it suggests that there could be a password-free back-door into the encrypted files that SharePoint uses.
                                      DoMiN8ToR
                                      Management
                                      Posted on: 23 May 14 15:48
                                        I certainly hope you're wrong BigJobbies and welcome to the forum
                                        tonypitt
                                        New Member
                                        Posted on: 31 Aug 14 18:18
                                          This same problem has started occurring this past week on OneDrive for consumers. It makes me wonder if Microsoft has migrated that product to the same platform/technology as OneDrive for Business. The support forums for OneDrive on Microsoft are now teeming with people trying to figure out why everything is broken for their Office files synced by OneDrive.
                                          Seán
                                          Senior Administrator & Reviewer
                                          Posted on: 31 Aug 14 18:38
                                            Thanks for letting us know tonypitt and welcome to Myce

                                            I'm very surprised to hear they started doing this with consumer files, particularly since online backup seems to be one of their main selling points for their Office 365 Home Personal/Premium subscription service. Modifying files being "backed up" is not really a back up, since modified files are no longer considered originals.

                                            We'll have a check into this to see if this happens on our end.
                                            tonypitt
                                            New Member
                                            Posted on: 31 Aug 14 18:41
                                              You can see a lot of others with this problem at the URL below:

                                              http://answers.microsoft.com/en-us/o...um?tab=Threads

                                              The problem seems to have started sometime around 8/27. The problem manifests itself when creating a file on one machine and then having it synced to another. If one goes to OneDrive on the web, you can download the file just fine, but any office file (particularly Excel files) synced automatically is reported as corrupted.

                                              I'm wondering if perhaps this only affects those with the 1 TB version of OneDrive for consumers, but have no way to test that out.
                                              DoMiN8ToR
                                              Management
                                              Posted on: 31 Aug 14 18:49
                                                @Seán, will you write about it? Thanks for reporting this Tony, very interesting!
                                                Seán
                                                Senior Administrator & Reviewer
                                                Posted on: 31 Aug 14 20:47
                                                  From my testing, syncing files between PCs in OneDrive (consumer version) appears to be fine with over 10 file types I tried, i.e. between two Windows 7 PCs and between two Windows 8.1 PCs. It meant I had to convert my Windows 8.1 local account into a Microsoft account, as SkyDrive for Windows 8 does not work in a local account.

                                                  However, I was able to replicate the reported corruption bug. This is different to the story I reported here where OneDrive for Business adds metadata to each file as this time it doesn't seem to be a metadata issue, i.e. when the file goes corrupt, it simply cannot open.

                                                  Edit: I was able to replicate this in Excel only, so I posted an article with a video recording to demonstrate it.

                                                  Many thanks for reporting this.
                                                  tonypitt
                                                  New Member
                                                  Posted on: 01 Sep 14 04:08
                                                    Great work on this Sean. Some new things have surfaced in users testing this over on the Microsoft site. Several people have claimed that when the Excel file syncs there is a brief moment when the full file appears to be present on the secondary machine, then the file size drops dramatically and the file is reported as corrupted. I can't replicate that, but my PCs and Internet connection are pretty fast.

                                                    If one elects to password protect the file, everything works just as it should. No file corruption.

                                                    As you noted, this issue at least seems Windows 8/8.1 specific. Part of me wonders if this really is a OneDrive problem or if OneDrive is actually a "victim" of some other Windows service.

                                                    I'm not suggesting this is specifically what the problem is, but it almost seems like something that might happen if a virus scanner or similar service on a Windows 8/8.1 machine examines Office files and somehow corrupts the Excel files in the process. Perhaps something about the way that OneDrive puts files in the file system triggers some kind of check that runs amuck.

                                                    My reason for speculating the above is, as you have noted, the files that are actually on the OneDrive cloud are not corrupted. If they are explicitly downloaded to the user's machine, they work just fine. It is only placement through the automatic file sync process that causes this problem.
                                                    Seán
                                                    Senior Administrator & Reviewer
                                                    Posted on: 01 Sep 14 10:56
                                                      I'll try to do some further tests later today if I get time.

                                                      I also seem to be a victim of the Windows 8.1 update bug. I don't think I rebooted my laptop since the last Windows Update process did so. When I rebooted yesterday evening, it got stuck in a BSOD loop for a few iterations and then went through an automatic system restore. This in turn caused Office 2013 to require an online repair, which took most of the evening due to my limited DSL speed.

                                                      One test I'd like to try is to boot my desktop into Windows 8.1 and modify an Excel file in my OneDrive folder to see if that also results in the Excel file becoming corrupt on my laptop. I'll also try various other tests, such as temporarily disabling MalwareBytes and Windows Defender (Windows 8's built-in Antivirus), doing a fresh sync, etc.

                                                      At work, we've already stopped using OneDrive for Business over a month ago. My work colleagues were having files going corrupt (not just recently) as well as complete crashes of OneDrive for Business (i.e. require the folder structure & cache to be removed and a fresh resync.) We're now using Google Drive and so far haven't had a single issue with it and it's also far less resource intensive than OneDrive for Business was. As for the personal version of OneDrive, I just use mine for testing only. I mainly use BitTorrent sync (this only syncs between devices, not to any online storage) and Dropbox (my phone came with 25GB for 2 years.)
                                                      Seán
                                                      Senior Administrator & Reviewer
                                                      Posted on: 02 Sep 14 23:22
                                                        Just to update on this discussion, the OneDrive for consumers file corruption issue has been reported fixed.

                                                        I have also run PC-to-PC sync tests with Google Drive and Dropbox and neither of these modifies files as they sync, at least across the 13 file types I tested with.

                                                        For curiosity's sake, I did check what happens if I open Word and Excel files online in the OneDrive consumer version to see if that causes any file modifications to be made and indeed it does, at least with Excel files.

                                                        Word Online: Opening a file in the online viewer appears to have no effect. Opening the file in the online editor does cause the file to be modified without typing a single keystroke. So don't open Word files in the online editor unless you make a backup.

                                                        Excel Online: Excel does not appear to offer a viewer mode online, i.e. opening an Excel file online without typing a single keystroke will cause the Excel file to be modified and thus modify locally stored versions!

                                                        So although the OneDrive consumer version does not appear to modify files synced between PCs, it can modify certain file types that are opened online (Excel in particular) without the user making a single modification.
