Methodology

We concentrate on a tiny aspect of wallet security. To make clear what we do and do not do, this page explains roughly how we work.

What we do

To put it dramatically, we search for the obvious potential to empty all the wallets of all the users at once. Could the provider of the wallet, with enough criminal energy, defraud all its users at once, without anybody being able to detect this before it is too late? (If he could in theory, then a sufficiently motivated criminal could also force him under duress to steal your funds, or get at your coins by manipulating him through social engineering or by planting a backdoor.)

This horror scenario is possible whenever the provider can obtain a copy of the wallet backup and thus access all the users’ funds at once. He could collect the backups and, once the amount of coins he could access stops growing, empty all the wallets in one big transaction. This form of scam became known as a “retirement attack”.

Seeing that some wallets have millions of users, it is plausible to assume that some wallets manage billions of dollars. This would be a huge incentive for criminally inclined employees, even if the wallet was not set up to scam its users from the start, which probably is the case for some wallets, too.

What we do not do

Our manual review goes as follows:

We take the perspective of a curious potential user of the respective product. We take all information from publicly available sources as we do not assume that potential users would sign NDAs prior to using a wallet. We also do not consider hard-to-find information. Our verdict therefore is based on what we can find within a few clicks from the product’s description. We occasionally search GitHub for the product’s identifiers. Without endorsement from the official website, a repository found this way is unlikely to yield a reproducible build, but we are happy to leave an issue on such a repository about our findings.

We answer the following questions usually in this order:

Did we get to a conclusion on the transparency of this product yet? If not, we tag it Development 

This product still needs to be evaluated some more. We only gathered name, logo and maybe some more details but we have not yet come to a conclusion what to make of this product.

Is the product still supported by the still existing provider? If not, we tag it Defunct! 

Discontinued products or worse, products of providers that are not active anymore, are problematic, especially if they were not formerly reproducible and well audited to be self-custodial following open standards. Similar circumstances might get this verdict, too, for example if the provider hasn’t answered inquiries for a year while their server is still running.

Was the product updated during the last two years? If not, we tag it Obsolete! 

Bitcoin wallets are complex products and Bitcoin is a new, advancing technology. Projects that don’t get updated in a long time are probably not well maintained. It is questionable whether the provider even has staff on hand who are familiar with the product, should issues arise.

This verdict may not get applied if the provider is active and expresses good reasons for not updating the product.

Was the product updated during the last year? If not, we tag it Stale! 

Bitcoin wallets are complex products and Bitcoin is a new, advancing technology. Projects that don’t get updated in a year are probably not well maintained.

This verdict may not get applied if the provider is active and expresses good reasons for not updating the product.
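The two update-age questions above can be sketched as a small function. This is only an illustration: the function name, the `provider_gave_good_reason` flag and the choice to count a year as 365 days are assumptions, not something the review process prescribes.

```python
from datetime import date, timedelta

# Assumption: a year is counted as 365 days for this sketch.
ONE_YEAR = timedelta(days=365)
TWO_YEARS = timedelta(days=730)


def update_verdict(last_update, today, provider_gave_good_reason=False):
    """Return "Obsolete!", "Stale!" or None based on the age of the last update."""
    # The text allows skipping the verdict if an active provider explains the gap.
    if provider_gave_good_reason:
        return None
    age = today - last_update
    # The two-year question is asked first, so it takes precedence.
    if age > TWO_YEARS:
        return "Obsolete!"
    if age > ONE_YEAR:
        return "Stale!"
    return None
```

Because the two-year check runs first, a product that has not been updated for three years is reported as Obsolete! and not as Stale!.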

Is this product the original? If not, we tag it Fake! 

The bigger wallets often get imitated by scammers who abuse the reputation of the product by imitating its name, logo or both.

Imitating a competitor is a huge red flag and we urge you to not put any money into this product!

Is this product available yet? If not, we tag it Un-Released 

We focus on products that have the biggest impact if things go wrong. While pre-sales sometimes get many thousands of people to buy into promises that never materialize, the damage is limited and there would be little definite to say about an unreleased product anyway.

If you find a product in this category that has been released in the meantime, please contact us to do a proper review!

Do many people use this product? If not, we tag it Few Users 

We focus on products that have the biggest impact if things go wrong and this one probably doesn’t have many users according to data publicly available.

Is it a wallet? If not, we tag it No Wallet 

If it’s called “wallet” but is actually only a portfolio tracker, we don’t look any deeper, assuming it is not meant to control funds. What holds no funds can’t lose your coins. It might still leak your financial history!

If you can “have” bitcoins in it but not actually receive or send them, then what really happens is that you can get exposure to the Bitcoin exchange rate but we still don’t consider it a wallet.

Is it for bitcoins? If not, we tag it No BTC 

At this point we only look into wallets that at least also support BTC.

Can it send and receive bitcoins? If not, we tag it No send/receive 

If it is for holding BTC but you can’t actually send or receive them with this product, then it doesn’t function like a wallet for BTC. You might still be using it to hold your bitcoins with the intention to convert back to fiat when you “cash out”.

All products in this category are custodial and thus funds are at the mercy of the provider.

Is the product self-custodial? If not, we tag it Custodial! 

A custodial service is a service where the funds are held by a third party like the provider. The custodial service can at any point steal all the funds of all the users at their discretion. Our investigations stop there.

Some services might claim their setup is super secure, that they don’t actually have access to the funds, or that the access is shared between multiple parties. For our evaluation of it being a wallet, these details are irrelevant. They might be a trustworthy Bitcoin bank and they might be a better fit for certain users than being your own bank but our investigation still stops there as we are only interested in wallets.

Products that claim to be non-custodial but feature custodial accounts without very clearly marking those as custodial are also considered “custodial” as a whole to avoid misguiding users that follow our assessment.

This verdict means that the provider might or might not publish source code and maybe it is even possible to reproduce the build from the source code but as it is custodial, the provider already has control over the funds, so it is not a wallet where you would be in exclusive control of your funds.

We have to acknowledge that a huge majority of Bitcoiners are currently using custodial Bitcoin banks. If you do, please:

  • Do your own research on whether the provider is trustworthy!
  • Check if you know enough about them to sue them if you have to!
  • Check if the provider is under a jurisdiction that will allow them to release your funds when you need them!
  • Check if the provider is taking security measures proportional to the amount of funds secured! If they have a million users and don’t use cold storage, that hot wallet is a million times more valuable for hackers to attack, and hackers will put correspondingly more effort into infiltrating their security systems.

Is the source code publicly available? If not, we tag it No Source! 

A wallet that claims to not give the provider the means to steal the users’ funds might actually be lying. In the spirit of “Don’t trust - verify!” you don’t want to take the provider at his word. Instead, you want to be able to trust that people hunting for fame and bug bounties could actually find flaws and backdoors in the wallet, so the provider doesn’t dare to put these in.

Backdoors and flaws are frequently found in closed source products but some remain hidden for years. And even in open source security software there might be catastrophic flaws undiscovered for years.

An evil wallet provider would certainly prefer not to publish the code, as hiding it makes audits orders of magnitude harder.

For your security, you thus want the code to be available for review.

If the wallet provider doesn’t share up to date code, our analysis stops there as the wallet could steal your funds at any time, and there is no protection except the provider’s word.

“Up to date” strictly means that any instance of the product being updated without the source code being updated counts as closed source. This puts the burden on the provider to always first release the source code before releasing the product’s update. This paragraph is a clarification to our rules following a little poll.

We are not concerned about the license as long as it allows us to perform our analysis. For a security audit, it is not necessary that the provider allows others to use their code for a competing wallet.

Is the decompiled binary legible? If not, we tag it Obfuscated! 

When compiling source code to binary, usually a lot of meta information is retained. A variable storing a masterseed would usually still be called masterseed, so an auditor could inspect what happens to the masterseed. Does it get sent to some server? But obfuscation would rename it for example to _t12, making it harder to find what the product is doing with the masterseed.

In benign cases, code symbols are replaced by short strings to make the binary smaller but for the sake of transparency this should not be done for non-reproducible Bitcoin wallets. (Reproducible wallets could obfuscate the binary for size improvements as the reproducibility would assure the link between code and binary.)

Especially in the public source cases, obfuscation is a red flag. If the code is public, why obfuscate it?

As obfuscation is such a red flag when looking for transparency, we do also sometimes inspect the binaries of closed source apps.

As looking for code obfuscation is a more involved task, we do not inspect many apps but if we see other red flags, we might test this to then put the product into this red-flag category.
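As a rough illustration of what "legible" can mean, the following sketch estimates how many identifiers in a decompiled binary look machine-generated, like the `_t12` example above. It is a crude heuristic under stated assumptions, not the method used for these reviews: the list of identifiers is assumed to come from some decompiler, and the pattern for "obfuscated-looking" names is hypothetical.

```python
import re

# Hypothetical pattern for machine-generated names such as "a", "b2" or "_t12":
# an optional underscore, one or two letters, then optional digits.
SHORT_NAME = re.compile(r"^_?[A-Za-z]{1,2}\d*$")


def obfuscated_share(identifiers):
    """Return the fraction of identifiers that look machine-generated."""
    names = list(identifiers)
    if not names:
        return 0.0
    hits = sum(1 for name in names if SHORT_NAME.match(name))
    return hits / len(names)
```

A high share only hints at obfuscation. Short names also occur in ordinary code, so a human still has to look at the decompiled output.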

Is the published binary matching the published source code? If not, we tag it Unreproducible! 

Published code doesn’t help much if it is not what the published binary was built from. That is why we try to reproduce the binary. We

  1. obtain the binary from the provider
  2. compile the published source code using the published build instructions into a binary
  3. compare the two binaries
  4. we might spend some time working around issues that are easy to work around
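The comparison in step 3 can be sketched as a hash comparison of the two files. This is a minimal sketch, assuming both binaries are available as local files; the function names and paths are hypothetical and the actual tooling used for reviews may differ.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries need not fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def binaries_match(downloaded_path, built_path):
    """True only if both files are byte-for-byte identical (same SHA-256)."""
    return sha256_of(downloaded_path) == sha256_of(built_path)
```

A call such as `binaries_match("downloaded.apk", "built.apk")` answers only yes or no. When the answer is no, it says nothing about where the files differ.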

If this fails, we might check whether other revisions match or whether we can deduce the source of the mismatch. Generally, though, we consider it the provider’s job to supply the correct source code and build instructions to reproduce the build, so we usually open a ticket in their code repository.

In any case, the result is a discrepancy between the binary we can create and the binary we can find for download and any discrepancy might leak your backup to the server on purpose or by accident.

As we cannot verify that the source provided is the source the binary was compiled from, this category is only slightly better than closed source, but for now we hope that projects will come around and fix their verifiability issues.

Does the binary we built differ from what we downloaded? If not, we tag it Reproducible 

If we can reproduce the binary we downloaded from the public source code, with all bytes accounted for, we call the product reproducible. This does not mean we audited the code but it’s the precondition to make sure the public code has relevance for the provided binary.
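When two builds are not identical as files, it can help to see which parts differ. The sketch below assumes the product ships as a zip-based package (Android APKs are zip archives) and that entries under `META-INF/`, where signature files commonly live, are expected to differ because only the provider can sign the release. Both the function names and the `expected_to_differ` default are assumptions for illustration. The sketch compares extracted file contents only, so it is weaker than accounting for every byte of the container.

```python
import hashlib
import zipfile


def entry_hashes(archive_path):
    """Map each file entry in a zip archive to the SHA-256 of its content."""
    with zipfile.ZipFile(archive_path) as archive:
        return {
            info.filename: hashlib.sha256(archive.read(info)).hexdigest()
            for info in archive.infolist()
            if not info.is_dir()
        }


def unaccounted_entries(downloaded_path, built_path, expected_to_differ=("META-INF/",)):
    """List entries that differ or are missing and are not expected to differ."""
    downloaded = entry_hashes(downloaded_path)
    built = entry_hashes(built_path)
    differing = {
        name
        for name in downloaded.keys() | built.keys()
        if downloaded.get(name) != built.get(name)
    }
    prefixes = tuple(expected_to_differ)
    return sorted(name for name in differing if not name.startswith(prefixes))
```

An empty result means every differing entry falls under an expected prefix. Any other entry in the list is a discrepancy that needs an explanation.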

If the provider puts your funds at risk on purpose or by accident, security researchers can see this if they care to look. It also means that inside the company, engineers can verify that the release manager is releasing the product based on code known to all engineers on the team. A scammer would have to work under the potential eyes of security researchers. He would have to put more effort into hiding any exploit.

“Reproducible” does not mean “verified”. There is good reason to believe that security researchers as of today would not detect even very blatant backdoors in the public source code before they get exploited, much less if the attacker takes moderate efforts to hide them. This is especially true for less popular projects.
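The questions above are asked in order and the first "no" determines the verdict. A minimal sketch of that cascade follows. The dictionary keys are hypothetical names for the questions, and treating a missing answer as "no" is an assumption of this sketch. In practice the order is only the usual one and some checks, such as the obfuscation check, are not performed for every product.

```python
# Ordered (question, verdict if the answer is "no") pairs, following the text.
CHECKS = [
    ("reached_conclusion", "Development"),
    ("still_supported", "Defunct!"),
    ("updated_within_two_years", "Obsolete!"),
    ("updated_within_one_year", "Stale!"),
    ("is_original", "Fake!"),
    ("is_released", "Un-Released"),
    ("has_many_users", "Few Users"),
    ("is_wallet", "No Wallet"),
    ("supports_btc", "No BTC"),
    ("can_send_and_receive", "No send/receive"),
    ("is_self_custodial", "Custodial!"),
    ("source_is_public", "No Source!"),
    ("binary_is_legible", "Obfuscated!"),
    ("binary_matches_source", "Unreproducible!"),
]


def verdict(answers):
    """Return the tag of the first question answered with "no"."""
    for question, tag in CHECKS:
        # Assumption: an unanswered question counts as "no".
        if not answers.get(question, False):
            return tag
    return "Reproducible"
```

For example, a product with public source and a matching binary still ends up as Custodial! if `is_self_custodial` is false, because that question comes earlier in the list.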

What is a hardware wallet?

There is no globally accepted definition of a hardware wallet. Some consider a paper with 12 words a hardware wallet - after all paper is a sort of hardware or at least not software and the 12 words are arguably a wallet(’s backup).

For the purpose of this project we adhere to higher standards in the hardware wallet section. We only consider a product a hardware wallet if dedicated hardware protects the private keys in a way that leaves the user in full and exclusive control of which transactions he signs. That means:

Our steps when reviewing a hardware wallet

We try to follow the spirit of the software review process, looking at the firmware and its updates for public source and reproducibility.

In addition we look at physical properties of the device.

Are the keys never shared with the provider? If not, we tag it Provided Keys 

The best hardware wallet cannot guarantee that the provider deleted the keys if the private keys were put onto the device by them in the first place.

There is no way of knowing if the provider took a copy in the process. If they did, all funds controlled by those devices are potentially also under the control of the provider and could be moved out of the client’s control at any time at the provider’s discretion.

Can the device sign transactions? If not, we tag it Leaks Keys 

Some people claim their paper wallet is a hardware wallet. Others use RFID chips with the private keys on them. A very crucial drawback of those systems is that in order to send a transaction, the private key has to be brought onto a different system that doesn’t necessarily share all the desired aspects of a hardware wallet.

Paper wallets need to be printed, exposing the keys to the PC and the printer even before funds are sent to them.

Simple RFID based devices can’t sign transactions - they share the keys with whoever asks, to use them for whatever they please.

Can the user verify and approve transactions on the device? If not, we tag it Bad Interface 

These are devices that might generate secure private key material outside the reach of the provider but that do not have the means to let the user verify transactions on the device itself. This verdict includes screen-less smart cards or USB-dongles.

The wallet lacks either an output device such as a screen, an input device such as touch or physical buttons, or both. In consequence, crucial elements of approving transactions are delegated to other hardware such as a general purpose PC or phone, which defeats the purpose of a hardware wallet.

The software of the device might be perfect but this device cannot be recommended due to this fundamental flaw.

Priorities

We cannot re-evaluate all the wallets every hour and, as this is still a side project, we might not be able to update anything for a month or even three in a row.

But when we update reviews, we try to proceed as follows:

  1. Re-evaluate new releases of Reproducible wallets as they become available. If users opt for a wallet because it is reproducible, they should wait for this re-evaluation before updating.
  2. Check if any of the Unreproducible! wallets updated their issues on their repositories.
  3. Make general improvements to the platform.
  4. Evaluate the most relevant Development wallets.

Wrap it up

In the end we report our findings. All wallets that fail at any of the above questions are considered high risk in our estimate. We might contact the wallet provider, try to find out what went wrong and report on the respective communication.

Even if we conclude not to trust a wallet, this doesn’t mean the wallet was out to steal your coins. It just means that we are confident that with enough criminal energy this wallet could theoretically steal all the funds of all its users.

No reproducible apps on Apple App Store?

WalletScrutiny started out looking only into Android. Mobile wallets are the most used wallets and Android is the most used mobile platform, but looking into iPhone wallets was high on the list from the start.

For Android, the process of reproducing builds was relatively clear and some apps did this before we started the project. For iPhone this was not the case. Reproducibility of iPhone apps was an open question.

One year passed. We asked around. Nobody could reproduce any iPhone app.

At this point we shift the burden of proof onto the providers (or Apple). If you want people to trust your app (or platform), explain how it can be audited. We will move on in the meantime and list iPhone apps with an empty reproducible section until then.

Otherwise, our methodology is the same as for Android wallets.

Further considerations

We will list as we stumble into them things like

What could still go wrong?

The verdict Reproducible unfortunately means very little. It means that at the arbitrary point in time when we decided to verify that the code matches the binary, the code actually did match the binary. It does not mean that the next update will match or that the prior one did, and it does not mean that the reproducible code is not doing evil things.

In fact, we believe the most likely scenario for an exit scam is a bait-and-switch. The provider would see how many users the product could grow to, or even buy out a successful wallet in financial trouble, and then introduce code to leak the backups.

The evil code would not be present until the product is losing users (or funds under management) for whatever other reason.

Any stamp of approval, any past security audit or build verification would be obsolete. Therefore we don’t see our mission as fulfilled when all wallets are reproducible. There is a long road ahead from there. For users running reproducible wallets, the wallets would need actual code audits before the binary is released to its users.

To put things into perspective, reviewing the code some 5 developers put out is a full time job. Testing the reproducibility of a wallet is an hour of work the first time and thanks to automation, 5 minutes for every update.

To achieve a situation where most users are running verified apps, the release process would have to be massively decelerated and there would have to be strong incentives in place for security researchers to find issues.

Users are often in a big hurry to get bug fixes and wallet managers are in a big hurry to roll out new features, but this hurry stands against the security of all wallet users. Wallet developers “screw up” all the time. Almost always it’s just a crash in some corner case they didn’t anticipate when writing the code. Such crashes are highly inconvenient for users who expected to use their wallet right now, but they usually do not put any funds in the wallet at risk. The hurry does, however, put reviewers in the uncomfortable position of having to approve something that would need more review. Most reviewers are reviewing the work of their colleagues, and trusting them is kind of expected, at least by the colleagues themselves, but all it takes is one slip-up and the code might be compromised. And compromising code in ways that go unnoticed by an auditor is kind of a sport.