
This is fishy. The entire point of encrypted data is that one message cannot be distinguished from another without decryption (which would require the private key). In other words, the entire premise shouldn't work. This means that either bad encryption is being used (i.e. statistical information about the data is leaked) or the good results we see are just noise. Or, the whole thing is a scam to get funding: the best algorithms are planted, and the company just shuffles BTC between accounts it controls.


The data is homomorphically encrypted, meaning you can perform operations (such as addition and subtraction) on the ciphertext and they carry over to the underlying data.


Yes, I realize that. The issue is that the result of any operation is also encrypted, which means that there should be no way to connect the target of the training data (encrypted or not) to the output of a function of encrypted data. Suppose the unencrypted data is (a,b,c) where a+b=c, and (x,y,z)=encrypt((a,b,c)). We have an addition function plus on encrypted data such that decrypt(plus(x,y))=a+b=c=decrypt(z), but it is not the case that plus(x,y)=z (at least, not if plus is computable in polynomial time, and assuming the encryption scheme is sound). If it were, we could statistically distinguish encrypt((a,b,c)) from encrypt((rand(),rand(),rand())) which would mean the encryption is not sound.
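To make the argument above concrete, here is a minimal sketch using a toy Paillier cryptosystem. Paillier is chosen purely for illustration as a well-known additively homomorphic scheme; nothing in this thread says it is what the company uses. The primes are tiny and offer no security, and the names `encrypt`, `decrypt`, and `plus` simply mirror the notation in the comment.

```python
import math
import secrets

# Toy Paillier parameters: tiny primes for illustration only, NOT secure.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
G = N + 1                    # standard simple choice of generator
PHI = (P - 1) * (Q - 1)      # used as the private exponent
MU = pow(PHI, -1, N)         # modular inverse of PHI mod N (Python 3.8+)


def encrypt(m):
    """Randomized encryption: a fresh r is drawn on every call."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover m from c using the private values PHI and MU."""
    u = pow(c, PHI, N2)
    return ((u - 1) // N * MU) % N


def plus(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N2


a, b, c = 12, 30, 42
x, y, z = encrypt(a), encrypt(b), encrypt(c)
print(decrypt(plus(x, y)) == decrypt(z))  # both decrypt to a + b
print(plus(x, y) == z)                    # almost certainly False: ciphertexts are randomized
```

Because each ciphertext carries fresh randomness, someone without the key cannot compare `plus(x, y)` against `z` to learn that a+b=c, which is the point being made about indistinguishability.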


They could just be normalizing every data point to be between 0 and 1 by dividing by the range. That's a homomorphic encryption, and it passes your weird assumptions.

I don't know why you're harping on about sound encryption. The point of this is to keep the statistical information intact in the cipher without giving away the underlying market data.


It's very weird to call data normalization "encryption". This is just a standard procedure done on most datasets. Encryption implies they've done extra processing to make sure you can't figure out what the variables mean.

I think it's either an abuse of the word 'encryption', or they really have done something weird to this dataset, which will probably make it useless for statistical algorithms. Even normalization destroys a lot of useful information.


They only need to go as far as to obfuscate the market data this was derived from, so they don't have to pay exchange licencing fees.

It doesn't have to have an exponential time complexity on decryption to qualify as 'encryption'. Multiplying by 2 could be considered homomorphic encryption.

You might think encryption means something else and that it's an abuse of the word but unlike the spy novella that you derive this impression from, these guys actually are ex-spies.
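As a sketch of the "multiplying by 2" example, assuming integer data and a hypothetical secret factor `K`: the scheme below is additively homomorphic, but it is also deterministic, so the ciphertext equality that a sound scheme is supposed to hide holds exactly.

```python
K = 2  # hypothetical secret scale factor, standing in for the "key"


def encrypt(m):
    """Scale the plaintext by the secret factor."""
    return K * m


def decrypt(c):
    """Undo the scaling."""
    return c // K


def plus(c1, c2):
    """Adding ciphertexts adds the plaintexts, since K*a + K*b = K*(a + b)."""
    return c1 + c2


a, b, c = 12, 30, 42
x, y, z = encrypt(a), encrypt(b), encrypt(c)
print(decrypt(plus(x, y)) == c)  # True: the homomorphic property holds
print(plus(x, y) == z)           # True: relationships between plaintexts are visible
```

Whether that counts as "encryption" is the disagreement in this thread; the sketch only shows that the homomorphic property by itself says nothing about how much the ciphertexts leak.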


> They only need to go as far as to obfuscate the market data this was derived from, so they don't have to pay exchange licencing fees.

Thanks for explaining this, I was struggling to figure out the difference between this and https://www.quantopian.com/

That's actually pretty clever.


What I'm talking about is indistinguishability [0], [1], which shows up in definitions of homomorphic encryption (e.g. [2], [3]). If the data is only normalized, or even encrypted in an order-preserving way, it seems possible to figure out information about the underlying data (e.g. if the target is whether a symbol moves up or down, and if you can figure out what even one of the features refers to, there's enough information to turn your model on the data into predictions you can just trade on).

[0] https://en.wikipedia.org/wiki/Computational_indistinguishabi...
[1] https://en.wikipedia.org/wiki/Ciphertext_indistinguishabilit...
[2] http://cs.au.dk/~stm/local-cache/gentry-thesis.pdf
[3] https://arxiv.org/ftp/arxiv/papers/1305/1305.5886.pdf


I read through the blog posts, and it seems like the encryption is order-preserving. It's designed to leak enough information to be useful in prediction, but not enough for users to trade on it independently.
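A rough sketch of what an order-preserving transform leaks, under stated assumptions: `obscure` is a made-up monotone function standing in for whatever the real scheme does (the blog posts are not quoted here), and the price list is invented sample data. The values change, but the ranking of the observations survives intact.

```python
import math


def ranks(values):
    """Return the rank (0 = smallest) of each element in its original position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def min_max(values):
    """Normalize to [0, 1] by dividing by the range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def obscure(values):
    """Hypothetical order-preserving transform: any strictly increasing map works."""
    return [math.exp(3 * v) + 7 for v in min_max(values)]


prices = [101.2, 99.8, 103.5, 102.1, 98.4]  # invented sample data
hidden = obscure(prices)

print(hidden)                           # values no longer resemble prices
print(ranks(hidden) == ranks(prices))   # True: the ordering is fully preserved
```

So anyone holding a candidate raw series could compare rank orderings to test whether a feature matches it, which is the kind of leakage discussed above. Rank-based information is also what many predictive models mostly rely on, which is presumably why such a scheme stays useful for prediction.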



