Author: BitMEX Research
Source: https://blog.bitmex.com/bitcoins-duplicate-transactions/
Abstract: The Bitcoin blockchain contains two pairs of identical transactions. One pair is sandwiched between the two transactions of the other, and all four appeared in mid-November 2010. Duplicate transactions can cause confusion, and Bitcoin developers have spent years working to prevent them. The problem is still not fully resolved, and the next possible duplicate could occur in 2046. Although the risk posed by duplicate transactions is now small, it remains an interesting bug worth thinking about.
Overview
A normal Bitcoin transaction spends at least one output of a previous transaction, and it identifies that output by the previous transaction's identifier (TXID). Each unspent output can be spent only once; if outputs could be spent twice, the same bitcoin could be spent twice, which would make Bitcoin worthless. Nevertheless, two pairs of identical transactions do exist on the Bitcoin chain. This is possible because coinbase transactions (the transactions through which block producers collect the block reward) have no transaction inputs and instead create new coins. Two different coinbase transactions can therefore send the same amount to the same address and be constructed in exactly the same way, which makes them identical. Because the transactions are identical, their TXIDs are also identical (the TXID is a hash digest of the transaction data). The only other way for two TXIDs to match is a hash collision (two different preimages producing the same hash value), which is assumed to be computationally infeasible for a cryptographically secure hash function. No SHA256 collision has ever been found, in Bitcoin or anywhere else.
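As a minimal sketch of why identical transactions share a TXID, the snippet below hashes raw bytes with double SHA-256 and prints the digest in the byte-reversed hex form that block explorers display. The byte strings are hypothetical stand-ins, not real serialized transactions, and the helper name `txid` is ours.

```python
import hashlib


def txid(raw_tx: bytes) -> str:
    """Double SHA-256 of the serialized transaction, as byte-reversed hex."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()


# Hypothetical stand-in bytes, NOT real serialized transactions.
coinbase_a = b"same payout, same script, same everything"
coinbase_b = b"same payout, same script, same everything"
coinbase_c = b"same payout, different extra nonce"

print(txid(coinbase_a) == txid(coinbase_b))  # identical bytes, identical TXID
print(txid(coinbase_a) == txid(coinbase_c))  # any difference changes the TXID
```

A regular transaction cannot easily be repeated this way, because its inputs refer to outputs that can only be spent once; a coinbase transaction has no such inputs.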
The two pairs of identical transactions occurred close together in time, between 08:37 UTC on November 14, 2010 and 00:38 UTC on November 15, 2010. The first pair is sandwiched between the two transactions of the second pair. We treat d5d2….8599 as the first duplicate because it was the first to be duplicated, even though (oddly enough) the other transaction, e3bf….b468 , appeared on the chain earlier and was only duplicated later.
Details of overlapping transactions
Below, you can see two screenshots from the mempool.space block explorer showing the first pair of overlapping transactions, which occurred in two different blocks.
Interestingly, when you look up these transactions by URL, the mempool.space block explorer defaults to the earlier block (height 91812) for the first pair of duplicate transactions ( d5d2….8599 ) and to the later block (height 91880) for the second pair ( e3bf….b468 ). Blockstream.info and Btcscan.org behave the same way. In our basic testing, Blockchain.com and Blockchair.com behaved differently: they always show the later block.
Of the four blocks involved here, only one contained a transaction other than the coinbase transaction, and that was the block at height 91812; this transaction combined 1 BTC and 19 BTC into a single output worth 20 BTC.
Can these outputs be spent?
Because two transactions share each TXID, a later transaction has no way of saying which of the two it is spending. Each duplicate transaction is worth 50 BTC, so the duplicates involve 4 * 50 BTC = 200 BTC in total, or 2 * 50 BTC = 100 BTC, depending on how you count. From one perspective, 100 BTC do not exist. As of today, none of the 200 BTC has been spent. As far as we understand (we may be wrong), anyone who holds the private keys behind these outputs can still spend the coins. However, once an output is spent, the UTXO is removed from the database, and the other 50 BTC under the same TXID becomes unspendable (lost), so only 100 BTC can be recovered. Whether the spent coins came from the earlier block or the later block is undefined.
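The accounting above can be illustrated with a toy model. This is only a sketch, assuming the unspent outputs live in a map keyed by (TXID, output index); the function names are hypothetical and real node software is far more involved.

```python
def add_coinbase(utxo_set: dict, txid: str, amount_btc: int) -> None:
    """Record a coinbase output; a repeated TXID overwrites the old entry."""
    utxo_set[(txid, 0)] = amount_btc


def spend(utxo_set: dict, txid: str, index: int) -> int:
    """Spending removes the entry, leaving nothing behind for the other copy."""
    return utxo_set.pop((txid, index))


utxo_set = {}
add_coinbase(utxo_set, "d5d2....8599", 50)  # first appearance
add_coinbase(utxo_set, "d5d2....8599", 50)  # duplicate, same key

print(sum(utxo_set.values()))  # spendable BTC tracked for this TXID
spend(utxo_set, "d5d2....8599", 0)
print(len(utxo_set))           # entries left after a single spend
```

In this model, 100 BTC was created under the TXID but only one 50 BTC entry is ever tracked, which matches the "only half is recoverable" reasoning in the text.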
It would have been possible for someone to spend all the bitcoin in a transaction before its duplicate was created; the duplicate outputs would then have created new entries in the database (the set of unspent outputs). In that case there would have been not only duplicate transactions but also duplicate transactions with duplicate spent outputs. Spending those outputs would in turn have opened up the possibility of further duplicates, creating chains of duplicate transactions. Anyone doing this would have had to arrange the order of events carefully, always spending the original before the duplicate was created and never after, or some of the bitcoin could have been lost forever. These new duplicates would not have been coinbase transactions but "regular" transactions. Fortunately, this never happened.
Problems caused by overlapping transactions
Duplicate transactions are clearly undesirable. They confuse wallets and block explorers and obscure where bitcoins came from. They also open the door to many attacks and exploits. For example, an attacker could pay someone twice using two duplicate transactions; when the recipient later tries to spend the funds, only half turn out to be spendable. Such an attack could, for instance, leave an exchange insolvent at no cost to the attacker, who can request a withdrawal immediately after depositing the funds.
Disallow transactions with duplicate TXIDs
To mitigate the duplicate transaction problem, Bitcoin developer Pieter Wuille proposed BIP30 in February 2012. It forbids any transaction whose TXID duplicates that of an existing transaction unless the earlier transaction has been fully spent. This soft fork applied to all blocks after March 15, 2012.
In September 2012, Bitcoin developer Greg Maxwell changed the rule so that the BIP30 check applies to all blocks, not only those produced after March 15, 2012, with the two pairs of duplicate transactions described above treated as exceptions. This fixed some DoS vulnerabilities. Technically this was another soft fork, but because the rule change only affected blocks that were already more than six months old, it carried none of the risks of a normal protocol rule change.
BIP30 checks are computationally expensive. A node must check every transaction output in a new block to see whether it already exists in the UTXO set. This is presumably why Wuille designed the rule to check only unspent outputs: checking against every output ever created (since the Genesis Block) would be even more expensive, and pruning would no longer be possible. (Translator's note: Pruning mode is a node operation mode that deletes part of the historical blocks but keeps the complete UTXO set so that new blocks can still be verified.)
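The shape of that check can be sketched as follows. This is an illustration only, assuming the UTXO set is a Python set of (TXID, output index) pairs and each block is reduced to (TXID, number of outputs) tuples; it is not how any particular node implements the rule, and it does not model the two historical exceptions.

```python
def violates_bip30(block_txs, utxo_set) -> bool:
    """Return True if any output the block would create is still unspent.

    block_txs: iterable of (txid, number_of_outputs) tuples (hypothetical shape)
    utxo_set:  set of (txid, output_index) pairs that are currently unspent
    """
    for txid, n_outputs in block_txs:
        for index in range(n_outputs):
            if (txid, index) in utxo_set:
                return True  # would overwrite an existing unspent output
    return False


unspent = {("aa", 0), ("bb", 1)}
print(violates_bip30([("aa", 1)], unspent))  # same TXID, output 0 still unspent
print(violates_bip30([("cc", 2)], unspent))  # new TXID, nothing to overwrite
```

Every output of every transaction in the block needs a lookup, which is where the cost comes from. A fully spent earlier transaction leaves no entries behind, so a duplicate of it passes the check.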
BIP34
In July 2012, Bitcoin developer Gavin Andresen proposed BIP34 , which activated in March 2013. This protocol change requires coinbase transactions to include the block height and enables versioning of the block format. The block height is added as the first element of the coinbase transaction's scriptSig (script signature). The first byte of the scriptSig gives the number of bytes used to encode the block height, and the height itself follows. For the first 160 years ( 2 ^ 23 / (144 blocks/day * 365 days/year) ), the first byte should always be 0x03. This is why today's coinbase scriptSigs (in hexadecimal) always start with 03. This soft fork appeared to solve the duplicate transaction problem completely, so that all transactions should now be unique.
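To make the encoding concrete, here is a minimal sketch that builds the length byte and the little-endian height bytes and reproduces the 160-year estimate. The function name is hypothetical, and the sketch deliberately does not handle heights of 16 or below.

```python
def bip34_height_push(height: int) -> bytes:
    """Length byte followed by the height as a little-endian script number."""
    if height <= 16:
        raise ValueError("this sketch only covers heights above 16")
    body = bytearray()
    n = height
    while n:
        body.append(n & 0xFF)
        n >>= 8
    if body[-1] & 0x80:
        # The top bit of the last byte is a sign bit, so pad with a zero byte.
        body.append(0x00)
    return bytes([len(body)]) + bytes(body)


print(bip34_height_push(1_983_702).hex())  # length byte 03, then the height
print(bip34_height_push(2**23 - 1)[0])     # last height that fits in 3 bytes
print(bip34_height_push(2**23)[0])         # first height that needs 4 bytes
print(2**23 / (144 * 365))                 # years until that happens, about 159.6
```

The sign-bit padding is why the three-byte range ends at 2^23 rather than 2^24, which is where the 2^23 in the estimate comes from.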
In November 2015, after BIP34 had been adopted, Bitcoin developer Alex Morcos opened a pull request against the Bitcoin Core software repository to make nodes stop performing the BIP30 check. After all, since BIP34 had fixed the problem, the expensive check no longer seemed necessary. Nobody realized it at the time, but this was technically a hard fork, although one that affects only a few very rare future blocks. Today this potential hard fork hardly matters, because almost no one runs node software released before November 2015. At forkmonitor.info , we run Bitcoin Core 0.10.3, which was released in October 2015. It is therefore a client that follows the pre-hard-fork rules and still performs the expensive BIP30 check.
Problem with block 1,983,702
Later, it was discovered that some coinbase transactions in blocks mined before BIP34 activated have scriptSigs that happen to match the height encoding of future, BIP34-valid blocks. So although BIP34 fixes the problem in most cases, it is not a 100% fix. In 2018, Bitcoin developer John Newbery compiled a complete list of these potential duplicates:
| Old block | New block | Estimated date of the new block (DD/MM/YYYY) |
|---|---|---|
| 209,920 | 209,921 | 28/11/2012* |
| 176,684 | 490,897 | 21/10/2017* |
| 164,384 | 1,983,702 | 13/01/2046 |
| 169,895 | 3,708,179 | 28/10/2078 |
| 170,307 | 3,709,183 | 03/11/2078 |
| 171,896 | 3,712,990 | 30/11/2078 |
| 172,069 | 3,713,413 | 03/12/2078 |
| 172,357 | 3,714,082 | 07/12/2078 |
| 172,428 | 3,714,265 | 09/12/2078 |
| 183,669 | 3,761,471 | 02/11/2079 |
| 196,988 | 4,275,806 | 12/08/2089 |
| 174,151 | 5,208,854 | 11/05/2107 |
| 201,577 | 5,327,833 | 14/08/2109 |
| 206,039 | 7,299,941 | 11/02/2147 |
| 206,354 | 7,299,941 | 11/02/2147 |
- Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd -
Notes on “*”: These blocks were mined in 2012 and 2017, and their coinbase transactions did not turn out to be duplicates. Block 209,921 (79 blocks before the first halving) could not have been a duplicate because the BIP30 check was still enforced at that time.
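The kind of scan that produces such a list can be sketched as below: read the start of an old coinbase scriptSig as if it were a BIP34 height push and see which future height it would satisfy. This is an illustration under our own assumptions (the function name, the 8-byte cap, and the example bytes are hypothetical), not the method used in the gist.

```python
def height_claimed_by_scriptsig(script_sig: bytes):
    """Height that a coinbase scriptSig would commit to under BIP34, else None."""
    if not script_sig:
        return None
    push_len = script_sig[0]
    # Assumption: only consider short direct pushes of 1 to 8 bytes.
    if not 1 <= push_len <= 8 or len(script_sig) < 1 + push_len:
        return None
    body = script_sig[1:1 + push_len]
    if body[-1] & 0x80:
        return None  # sign bit set: a negative number cannot be a height
    return int.from_bytes(body, "little")


# Hypothetical scriptSig: a 3-byte push followed by arbitrary miner data.
old_script_sig = bytes.fromhex("03d6441e") + b"arbitrary miner data"
print(height_claimed_by_scriptsig(old_script_sig))
```

A pre-BIP34 miner was free to put any bytes in the scriptSig, so some old coinbase transactions begin with bytes that decode to a plausible future height purely by accident.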
Chart: number of potential duplicate coinbase transactions per year
- Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd -
The next block that could contain a duplicate transaction is therefore block 1,983,702, which is expected in January 2046. The block at height 164,384 was mined in January 2012, and its coinbase transaction sent 170 BTC to 7 different addresses. A miner who wanted to carry out this attack in 2046 would therefore not only have to mine that block but also burn nearly 170 BTC of fees, because the duplicate coinbase transaction must pay the same outputs as the 2012 one. The total cost would be slightly above 170 BTC once the opportunity cost of the 0.09765625 BTC block subsidy is included. At the current Bitcoin price of $88,500, that is about $15 million. It is not known who controlled the 7 addresses in the 2012 coinbase transaction, and the private keys have very likely been lost by now. All 7 outputs of that coinbase transaction have been spent, 3 of them in the same transaction. We believe these funds may be related to the Pirate40 Ponzi scheme, although this is only speculation. The attack therefore looks not only very expensive but also almost completely pointless: it would cost a very large sum merely to fork nodes running 31-year-old software off the network.
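The cost estimate can be reproduced with a few lines of arithmetic. This is a rough sketch: the 170 BTC figure and the $88,500 price come from the text, the 210,000-block halving interval is standard, and the use of floats rather than integer satoshis is a simplification.

```python
def block_subsidy_btc(height: int) -> float:
    """Block subsidy in BTC, halving every 210,000 blocks from 50 BTC."""
    halvings = height // 210_000
    return 50 / 2**halvings


height = 1_983_702
burned_btc = 170       # approximate payout of the 2012 coinbase, per the text
price_usd = 88_500     # exchange rate quoted in the text

subsidy = block_subsidy_btc(height)
total_btc = burned_btc + subsidy

print(subsidy)                # subsidy the attacker forgoes
print(total_btc)              # total BTC cost of the attack
print(total_btc * price_usd)  # cost in USD, roughly 15 million
```

Block 1,983,702 falls after the ninth halving, which is where the 0.09765625 BTC subsidy comes from.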
(Translator's note: The logic here is that BIP34 does not prohibit duplicate transactions; only BIP30 does. So if a duplicate transaction is created at height 1,983,702, it will only cause trouble for nodes that still run the BIP30 check, which will be forked off the network made up of nodes that no longer run it. One could also say that there is currently no clear standard for how duplicate transactions should be handled.)
The next vulnerable block after that would have a coinbase transaction that duplicates the one in the block at height 169,895, mined in March 2012. The outputs of that historical coinbase transaction total only slightly more than 50 BTC, far less than 170 BTC. The 50 BTC reflects the block subsidy at the time; by 2078, when the duplicate becomes possible, the subsidy will be far lower. To exploit this vulnerability, a miner would therefore have to burn about 50 BTC of fees, which cannot be recovered because they must be paid to the same outputs as in the 2012 transaction. Nobody knows what Bitcoin will be worth in 2078, but this attack is also likely to be extremely expensive. The issue is therefore probably not a major risk for Bitcoin, but it is still a concern.
Since the 2017 SegWit upgrade, coinbase transactions can also include a witness commitment covering all transactions in the block. The pre-BIP34 coinbase transactions contain no such commitment. To produce a duplicate coinbase transaction, a miner would therefore have to exclude every transaction that spends a SegWit output from the block, which raises the opportunity cost of the attack further, because fewer fee-paying transactions can be included.
Conclusion
The duplicate transaction bug does not appear to be a major security issue for Bitcoin: exploiting it is difficult and expensive, and the opportunities to do so are rare. It is still an interesting problem, given the timescales involved and the curious properties of duplicate transactions. Developers have nonetheless spent a lot of time on it, and 2046 may already be fixed in some developers' minds as the deadline for a solution. There are several ways to fix the problem, and they may require a soft fork. One possible solution is to make the SegWit commitment (included in the coinbase transaction) mandatory.