synology-decrypt
(require synology-decrypt) | package: synology-decrypt |
This package implements a library and command-line client to decrypt files encrypted by the Synology Cloudsync program.
It only supports password based decryption.
It only supports version 3.1 of the encryption format.
1 Running the program
racket main.rkt <encrypted file path> <decrypted output path>
When run like this, the program will read standard input to obtain the decryption password. Standard input is read to avoid leaking the password in shell history or in-memory process information. Type the password and press Ctrl+D to close standard input.
Run with --help to see all the options.
2 Reference
This package can also be used as a library.
procedure
(decrypt-ports password input-port output-port [#:external-lz4 use-external-lz4?]) → (or/c string? #f)
  password : string?
  input-port : input-port?
  output-port : output-port?
  use-external-lz4? : boolean? = #f
If use-external-lz4? is #t, the external lz4 program is used. If it cannot be found on the system path, an error is raised.
The external lz4 program can be slightly faster for hundreds of MBs of data.
procedure
(decrypt-file password input output #:external-lz4 use-external-lz4?) → void?
  password : string?
  input : path-string?
  output : path-string?
  use-external-lz4? : (or/c boolean? 'decide)
If use-external-lz4? is 'decide, a heuristic based on the size of the file chooses between the internal and the external lz4. If it is a boolean, the semantics are the same as for decrypt-ports.
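As a rough usage sketch based only on the signatures above, the following decrypts one file by path and another through ports. The password and file names are placeholders, and the example assumes the package is installed.

```racket
#lang racket/base
(require synology-decrypt)

;; Placeholder password and paths; substitute real values.
(define password "my-secret-password")

;; Path-based API: let the library decide which lz4 to use.
(decrypt-file password "photo.jpg.enc" "photo.jpg" #:external-lz4 'decide)

;; Port-based API, relying on the default (internal) lz4.
(call-with-input-file "notes.txt.enc"
  (lambda (in)
    (call-with-output-file "notes.txt"
      (lambda (out)
        (decrypt-ports password in out))
      #:exists 'replace)))
```

Hard-coding the password is only for illustration; reading it from standard input, as the command-line client does, avoids leaving it in source files.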
3 Encrypted File Format
The Synology Cloudsync encrypted file is a binary file describing structured data. The structured data begins with some magic bytes identifying the file, then a series of dictionaries. Dictionary keys are strings, while values can be integers, byte strings, strings or nested dictionaries.
3.1 Syntax
The syntax can be described with this BNF grammar.
| ‹file› | ::= | ‹magic› ‹magic-hash› ‹dictionary›+ |
| ‹magic› | ::= | __CLOUDSYNC_ENC__ |
| ‹magic-hash› | ::= | d8d6ba7b9df02ef39a33ef912a91dc56 |
| ‹dictionary› | ::= | 0x42 ‹dictionary-entry›* 0x40 |
| ‹dictionary-entry› | ::= | ‹key› ‹value› |
| ‹key› | ::= | ‹string› |
| ‹value› | ::= | ‹string› | ‹bytes› | ‹int› | ‹dictionary› |
| ‹string› | ::= | 0x10 ‹bytes-with-len› |
| ‹bytes› | ::= | 0x11 ‹bytes-with-len› |
| ‹int› | ::= | 0x01 ‹int-with-len› |
| ‹bytes-with-len› | ::= | ‹length› byte?+ |
| ‹length› | ::= | unsigned short, big endian encoded (2 bytes) |
| ‹int-with-len› | ::= | 1 byte length followed by length bytes (in practice always 1) |
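To make the grammar concrete, here is a minimal sketch of a reader for a single ‹dictionary›. It is not the package's own parser: all function names are hypothetical, it assumes strings are UTF-8 and multi-byte integers are big endian (the grammar does not say), and it leaves out the ‹magic› and ‹magic-hash› prefix.

```racket
#lang racket/base

;; Tag bytes taken from the grammar.
(define DICT-START #x42)
(define DICT-END   #x40)
(define STRING-TAG #x10)
(define BYTES-TAG  #x11)
(define INT-TAG    #x01)

;; Read exactly n bytes, raising an error on truncated input.
(define (read-exactly n in)
  (define bs (if (zero? n) #"" (read-bytes n in)))
  (unless (and (bytes? bs) (= (bytes-length bs) n))
    (error 'read-exactly "unexpected end of input"))
  bs)

;; Interpret a byte string as a big-endian unsigned integer (assumption
;; for ints longer than one byte).
(define (bytes->uint bs)
  (for/fold ([n 0]) ([b (in-bytes bs)])
    (+ (* n 256) b)))

;; <bytes-with-len>: 2-byte big-endian length, then that many bytes.
(define (read-bytes-with-len in)
  (read-exactly (bytes->uint (read-exactly 2 in)) in))

;; Read the value that follows an already-consumed tag byte.
(define (read-value tag in)
  (cond
    [(eqv? tag STRING-TAG) (bytes->string/utf-8 (read-bytes-with-len in))]
    [(eqv? tag BYTES-TAG)  (read-bytes-with-len in)]
    [(eqv? tag INT-TAG)
     ;; <int-with-len>: 1-byte length, then that many bytes.
     (bytes->uint (read-exactly (bytes->uint (read-exactly 1 in)) in))]
    [(eqv? tag DICT-START) (read-dict-body in)]
    [else (error 'read-value "unknown tag: ~a" tag)]))

;; Read entries until the 0x40 end marker; 0x42 was already consumed.
(define (read-dict-body in)
  (let loop ([entries '()])
    (define tag (read-byte in))
    (cond
      [(eqv? tag DICT-END) (make-immutable-hash entries)]
      [(eqv? tag STRING-TAG)
       (define key (bytes->string/utf-8 (read-bytes-with-len in)))
       (define value (read-value (read-byte in) in))
       (loop (cons (cons key value) entries))]
      [else (error 'read-dict-body "expected a key or end marker, got: ~a" tag)])))

;; Read one complete <dictionary>, including its start tag.
(define (read-dict in)
  (unless (eqv? (read-byte in) DICT-START)
    (error 'read-dict "expected dictionary start tag 0x42"))
  (read-dict-body in))
```

Calling read-dict repeatedly on the same port until end of file would yield the ‹dictionary›+ part of a file, each as an immutable hash keyed by strings.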
3.2 Semantics
The encryption scheme uses the password to derive a 32-byte key and 16-byte initialization vector (IV). The Key Derivation Function is the one used by OpenSSL. The key+IV is used to decrypt enc_key1 using AES in Cipher Block Chaining (AES-CBC) mode.
The decrypted enc_key1 will be a hex string. It should be converted to bytes and fed back into the Key Derivation Function. This will result in the final key+IV used to decrypt the actual file contents.
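The hex-to-bytes step between the two derivations might look like the sketch below. It assumes that "converted to bytes" means decoding each pair of hexadecimal digits into one byte; hex-string->byte-string and hex-digit-value are hypothetical helper names, not part of the package.

```racket
#lang racket/base

;; Value of a single hexadecimal digit character, or an error.
(define (hex-digit-value c)
  (or (string->number (string c) 16)
      (error 'hex-digit-value "not a hex digit: ~a" c)))

;; Decode a hex string such as "0a1f" into (bytes #x0a #x1f).
(define (hex-string->byte-string hex)
  (unless (even? (string-length hex))
    (error 'hex-string->byte-string "odd number of hex digits"))
  (apply bytes
         (for/list ([i (in-range 0 (string-length hex) 2)])
           (+ (* 16 (hex-digit-value (string-ref hex i)))
              (hex-digit-value (string-ref hex (+ i 1)))))))
```

The resulting byte string would then take the place of the password in the second call to the Key Derivation Function.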
The first dictionary in the file describes the encryption information. It consists of the following key-value pairs:
"compress" - always 1, indicating the file has been compressed before encryption.
"digest" - TODO
"enc_key1" - base64 encoded, encrypted key. The user’s password can be used to decrypt this to derive the decryption key.
"enc_key2" - TODO
"encrypt" - always 1, indicating the file is encrypted.
"file_name" - string representing the original file name.
"key1_hash" - TODO
"key2_hash" - TODO
"salt" - a string containing the salt used for key derivation.
"session_key_hash" - TODO. Used to validate the encryption key integrity.
"version" - a (sub) dictionary:
  "major" - 3
  "minor" - 1
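Since only version 3.1 of the format is supported, a reader would typically check the "version" entry before going further. The sketch below assumes the first dictionary has already been parsed into a hash with string keys; supported-version? is a hypothetical name.

```racket
#lang racket/base

;; True only when the header carries a "version" sub-dictionary
;; with major 3 and minor 1.
(define (supported-version? header)
  (define version (hash-ref header "version" #f))
  (and (hash? version)
       (equal? (hash-ref version "major" #f) 3)
       (equal? (hash-ref version "minor" #f) 1)))

;; Example header containing only the keys this check looks at.
(define example-header
  (hash "compress" 1
        "encrypt" 1
        "version" (hash "major" 3 "minor" 1)))

(supported-version? example-header)
```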
Once the encryption information is obtained, the remaining file (except for the last dictionary) consists of data dictionaries which have two key-value pairs:
"type" - always "data".
"data" - bytes representing encrypted chunks.
The data values are always 8192 bytes long, except the last one. To encrypt, the original file is first compressed using LZ4 if "compress" is true (1). The result is padded with PKCS7 padding and passed through AES-CBC using the key+IV derived above. AES-CBC is a block cipher mode in which each ciphertext block depends on the previous one, so the file is encrypted as one continuous stream rather than chunk by chunk. The encrypted data is then split into 8192-byte chunks.
This means that, to decrypt the data, one must feed all the chunks, in order, into a stateful AES-CBC decryption routine initialized with the key+IV above. Once the PKCS7 padding has been removed, the data must be run through LZ4 decompression to obtain the original file.
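The padding removal can be sketched independently of the AES step. In PKCS7, the final byte n of the padded data states how many padding bytes were appended, and each of those bytes has the value n. The function name pkcs7-unpad is hypothetical, and the default block size of 16 is the AES block size.

```racket
#lang racket/base

;; Strip PKCS7 padding from fully decrypted data.
(define (pkcs7-unpad data [block-size 16])
  (define len (bytes-length data))
  (when (or (zero? len) (not (zero? (remainder len block-size))))
    (error 'pkcs7-unpad "length ~a is not a positive multiple of ~a"
           len block-size))
  ;; The last byte gives the padding length.
  (define n (bytes-ref data (sub1 len)))
  ;; Every padding byte must equal n, and n must fit within one block.
  (unless (and (<= 1 n block-size)
               (for/and ([b (in-bytes data (- len n))]) (= b n)))
    (error 'pkcs7-unpad "invalid padding"))
  (subbytes data 0 (- len n)))
```

In a streaming decryptor only the final block needs this treatment, so the earlier chunks can be passed on to LZ4 decompression as they are decrypted.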
Finally, the last dictionary in the file is an additional meta-data dictionary that contains the MD5 checksum of the decrypted (original) file.
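A checksum comparison along these lines could confirm that decryption succeeded. This is only a sketch: it assumes the stored checksum is available as hexadecimal text, which the description above does not specify, and md5-matches? is a hypothetical name. The md5 function comes from Racket's file/md5 library and returns the digest as hex-encoded bytes by default.

```racket
#lang racket/base
(require file/md5)

;; Compare the MD5 of everything readable from `in` with an expected
;; checksum given as hexadecimal text (assumed representation).
(define (md5-matches? in expected-hex)
  (equal? (md5 in)
          (string->bytes/utf-8 (string-downcase expected-hex))))

;; Example with in-memory data; a real check would open the decrypted file.
(md5-matches? (open-input-bytes #"abc") "900150983cd24fb0d6963f7d28e17f72")
```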
3.3 Key Derivation Function
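The derivation builds an output buffer one digest at a time. Each round hashes the concatenation of the previous round's digest (empty in the first round), the password, and the salt, and appends the result to the buffer. Rounds continue until the buffer holds at least key-size + iv-size bytes. The helper repeated-hash, which is not shown here, receives a repeat count of 1000 when a salt is present and 1 when the salt is empty. The first key-size bytes of the buffer become the key, and the bytes after them become the IV.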
;; helper: extend key-iv-buffer one digest at a time until it holds
;; at least key-size + iv-size bytes
(define (openssl-kdf-iter password salt key-size iv-size key-iv-buffer temp)
  (let ([repeat-count (if (zero? (bytes-length salt)) 1 1000)])
    (if (< (bytes-length key-iv-buffer) (+ key-size iv-size))
        (let ([temp (repeated-hash (bytes-append temp password salt) repeat-count)])
          (openssl-kdf-iter password salt key-size iv-size
                            (bytes-append key-iv-buffer temp) temp))
        key-iv-buffer)))

;; returns (list key iv)
(define (openssl-kdf password salt key-size iv-size)
  (let ([key-iv-output (openssl-kdf-iter password salt key-size iv-size (bytes) (bytes))])
    (list (subbytes key-iv-output 0 key-size)
          (subbytes key-iv-output key-size))))
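The listing above relies on a repeated-hash helper whose definition is not included. One plausible sketch follows, assuming the hash function is MD5 (the text above does not name it) and that each raw digest is fed back in as the next input, as in OpenSSL's EVP_BytesToKey.

```racket
#lang racket/base
(require file/md5)

;; Assumed behaviour: apply MD5 `count` times, feeding each raw
;; 16-byte digest back in as the input of the next application.
(define (repeated-hash data count)
  (for/fold ([digest data]) ([i (in-range count)])
    (md5 digest #f))) ; #f requests raw bytes instead of hex text

;; Under the MD5 assumption each round of the KDF yields 16 bytes.
(bytes-length (repeated-hash #"password-and-salt" 1000))
```

With 16-byte digests, three rounds are needed to fill the 48 bytes of a 32-byte key plus a 16-byte IV.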
4 References and Acknowledgements
All the work to reverse engineer the file format was done by others.
Marnix Klooster’s Python synology-decrypt implementation. This is the original reverse engineering of the encryption scheme and file format as far as I know. It was immensely useful!