Reverse-engineering QNX 1.2 to boot from HDD

This is a guest post from Mihai Gaitos, A winning entry for Virtualization Challenge IV – Act II – QNX 1.2 HDD Boot ($2000 prize).

Since a lot of people (especially Zir Blazer) tried to use the available QNX 1.2 / QNX 2 tools to install a HDD boot loader and load the existing kernel, I decided to take a different approach and build a new loader. At first I was under the impression that maybe a BIOS disk driver was already present in the kernel. After realizing that there was no HDD driver included, I decided to try reverse-engineering the relevant parts of QNX.

Starting from the start (boot sector) helped me extract the kernel from the boot diskette and analyze it just enough to validate that it’s the right thing and the assumed entry point is correct.

In order to make things easier (and because it was a fun project per se) I wrote a somewhat simple QNX filesystem access tool that enabled me to extract files from the diskette and HDD images.

Going for mount

Afterwards, my main activity was centered on mount. As opposed to typical Linux/Unix mount, here it also loads the HDD driver. After finding out the executable file format ( I wrote another small tool to extract the code and data segments of QNX executables. Analyzing the disassembly, I have determined what operations mount performs in order to install the driver and mount a QNX partition. The main steps are:

  • load the driver file contents into malloc-ed buffer (inside the data segment of mount)
  • send a TA_ALLOC_SEG message to task in order to allocate a separate segment and copy driver there
  • build a DEFINE_DRIVER message using data from driver file and the allocated segment address and send the message to fsys (part of kernel, but separate task)
  • send a SET_ATTR message to fsys that has the side-effect of initializing the driver
  • use the driver to read first HDD sector (partition table)
  • send another SET_ATTR message to adjust disk size and offset to values read from partition

Knowing this gave me an idea to what my loader would need to do beside simply loading the kernel from HDD. However, this still depended on having an already running kernel to send messages to.

Back to kernel

The kernel is split into 5 parts:

  • task (task and memory management)
  • fsys (disk and filesystem)
  • dev (terminal devices)
  • idle (CPU arbiter)
  • shared (int 72 handler, mostly libc and other shared functions)

Description in parenthesis are my assumptions.

The copy protection routine (tries to read the last sector from diskette and if the read succeeds resets the computer) provides a good entry-point into the fsys part of the kernel. I assumed it can be replaced with some code to emulate what mount does. However, trying to allocate a segment (via TA_ALLOC_SEG message) hangs. I think this is causing a deadlock, since fsys initialization is called from task before it finished its initialization. Fortunatelly, while digging into this I noticed the header structure of the kernel, thus enabling me to increase its size in order to fit the xt driver at the end of fsys (it would have been slightly easier to put it at the end of shared, but that didn’t occur to me at the time).

Failing to use syscalls (DEFINE_DRIVER and SET_ATTR) meant I had to determine what those messages actually did. I disassembled fsys separately and proceeded to manually follow the code path in order to determine the effect each of those messages should have in the context of mounting a disk. Eventually it emerged that almost all of the data structures can be prefilled in the kernel image, leaving only the call to driver initializaion function.

I modified the kernel to add the xt driver at the end of fsys (modifying the header by hand), replaced the copy-protection routine with code to call its initialization, and indeed the harddrive was available from the start, without the need to run mount. I was still booting from diskette at this time but I was past the most difficult hurdle.

Finishing touches

Loading the kernel proved somewhat simple (I still have some knowledge about 16-bit assembly and real-mode BIOS) but the kernel “insisted” in trying to run /cmds/sh from floppy. At first I solved this by an ugly hack, modifying the command line string in kernel image from “/cmds/sh” to “3:/xi/sh” and “/config/sys.init” to “3:/xi/sys.init” (3: being the HDD identifier, similar to C: from DOS). The xi was needed in order to keep the same string length, or at least not making it larger since there was some other importand data just past this.

This mostly solved the challenge (there were some other minor mistakes and fixes), except I disliked that hack and went on to analyzing that first start of /cmds/sh, disassembling fopen (in shared) and finally finding the memory location where of the search system variable (somewhat similar to PATH). Modifying that variable eliminated the need for starting the first shell with “3:”.

Room for improvement

At present some parameters are hardcoded and the kernel is just placed at the end of the HDD, outside of QNX parition and its position and size is written in the boot sector (somewhat similar to the original QNX diskette approach). The partition size itself is hardcoded (by hand) in the kernel data structures instead of being read from the partition table on boot. Still, for something that is unlikely to ever run outside an emulator, I deem it good enough (for now).


  • to Zir Blazer for putting a lot of effort into his approach and documenting each step
  • to Mitchell Schoenbrun for providing insight into QNX system philosophy
  • to forty for beating the first challenge and identifying the copy-protection routine address
  • and of course, to Tenox, Neozeed and Dan Dodge for the challenge. And for providing me with a great prize for 3 weeks of hard-working fun!

To access files, tools, bootable image and ready to run in your browser PCjs with QNX 1.2 go to Mihai’s site post.

Virtualization Challenge IV – Act II – QNX 1.2 HDD Boot ($2000 prize)

(This is a guest post by Antoni Sawicki aka Tenox)

A couple of months ago we hosted VIrtualization Challenge for QNX v1.2. I expected that the hard part would be to circumvent the copy protection and the rest would be easy. It turned out to be quite the opposite! The copy protection was worked around in no time by several people independently. What turned out to be impossible is to install the OS on a hard disk.

QNX 1.2 does have several drivers for different hdd controllers including BIOS mode. It has fdisk, can create partitions, install MBR, format fs, mount hard disk volumes… but it cannot install boot code. Apparently this functionality has been added only in QNX 2.x. After a long debate we settled for a solution where you boot kernel from a floppy disk and use the rest of the os from a hard disk. This was implemented by Forty who won the challenge which was outlined in this post.

In a rather unexpected turn of events Dan Dodge, co-creator and CEO of QNX Software Systems himself reached out to us and offered to extend the contest to finish the process properly. Dan is offering $2000 prize for making QNX 1.2 boot from hard disk without use of the boot floppy disk. I have confirmed the details in an email exchange.

Rules: As always the winner will be the first person who provides a working image in the comments. Any emulator/hypervisor is allowed. You can use boot loader from QNX 2.x, or write your own or anything else you come up with. There are some tips in Dan’s comment. Ask away for more details. The QNX repository is here. Good luck! 🙂

Update: The challenge has been completed! The winner is Mihai Gaitos and this is the winning entry. I will work with Mihai to get a more detailed blog post of what has been done and Dan to hand out the $2000 prize. Congratulations!!!

Update: You can try QNX 1.2 straight in your browser with PCjs!

Internet archive to fully open up during “Covid19/SARS 2/武汉肺炎”

To address our unprecedented global and immediate need for access to reading and research materials, as of today, March 24, 2020, the Internet Archive will suspend waitlists for the 1.4 million (and growing) books in our lending library by creating a National Emergency Library to serve the nation’s displaced learners. This suspension will run through June 30, 2020, or the end of the US national emergency, whichever is later. 

During the waitlist suspension, users will be able to borrow books from the National Emergency Library without joining a waitlist, ensuring that students will have access to assigned readings and library materials that the Internet Archive has digitized for the remainder of the US academic calendar, and that people who cannot physically access their local libraries because of closure or self-quarantine can continue to read and thrive during this time of crisis, keeping themselves and others safe.  

This library brings together all the books from Phillips Academy Andover and Marygrove College, and much of Trent University’s collections, along with over a million other books donated from other libraries to readers worldwide that are locked out of their libraries.

This is a response to the scores of inquiries from educators about the capacity of our lending system and the scale needed to meet classroom demands because of the closures. Working with librarians in Boston area, led by Tom Blake of Boston Public Library, who gathered course reserves and reading lists from college and school libraries, we determined which of those books the Internet Archive had already digitized.  Through that work we quickly realized that our lending library wasn’t going to scale to meet the needs of a global community of displaced learners. To make a real difference for the nation and the world, we would have to take a bigger step.

“The library system, because of our national emergency, is coming to aid those that are forced to learn at home, ” said Brewster Kahle, Digital Librarian of the Internet Archive. “This was our dream for the original Internet coming to life: the Library at everyone’s fingertips.”

Public support for this emergency measure has come from over 100 individuals, libraries and universities across the world, including the Massachusetts Institute of Technology (MIT).  “Ubiquitous access to open digital content has long been an important goal for MIT and MIT Libraries. Learning and research depend on it,” said Chris Bourg, Director of MIT Libraries. “In a global pandemic, robust digital lending options are key to a library’s ability to care for staff and the community, by allowing all of us to work remotely and maintain the recommended social distancing.”

We understand that we’re not going to be able to meet everyone’s needs; our collection, at 1.4 million modern books, is a fraction of the size of a large metropolitan library system or a great academic library. The books that we’ve digitized have been acquired with a focus on materials published during the 20th century, the vast majority of which do not have a commercially available ebook.  This means that while readers and students are able to access latest best sellers and popular titles through services like OverDrive and Hoopla, they don’t have access to the books that only exist in paper, sitting inaccessible on their library shelves. That’s where our collection fits in—we offer digital access to books, many of which are otherwise unavailable to the public while our schools and libraries are closed. In addition to the National Emergency Library, the Internet Archive also offers free public access to 2.5 million fully downloadable public domain books, which do not require waitlists to view.

We recognize that authors and publishers are going to be impacted by this global pandemic as well. We encourage all readers who are in a position to buy books to do so, ideally while also supporting your local bookstore. If they don’t have the book you need, then Amazon or Better World Books may have copies in print or digital formats. We hope that authors will support our effort to ensure temporary access to their work in this time of crisis. We are empowering authors to explicitly opt in and donate books to the National Emergency Library if we don’t have a copy. We are also making it easy for authors to contact us to take a book out of the library. Learn more in our FAQ.

A final note on calling this a “National Emergency” Library.  We lend to the world, including these books. We chose that language deliberately because we are pegging the suspension of the waitlists to the duration of the US national emergency.  Users all over the world have equal access to the books now available, regardless of their location.

How you can help:

  1. Read books, recommend books, and teach using books from the National Emergency Library
  2. Sponsor a book to be digitized and preserved
  3. Endorse this effort institutionally or individually
  4. Share news about the National Emergency Library with your social media followers using #NationalEmergencyLibrary
  5. Donate to the Internet Archive

If you have additional questions, please check out our FAQ or contact Chris Freeland, Director of Open Libraries

Random fun with video cards

So that 3d Mark 11 bench tool was on sale, and I thought it’d be interesting to judge various machines.

NVIDIA GeForce RTX 2070(1x) and Intel Core i7-9700K ProcessorNVIDIA GeForce GTX 980(1x) and Intel Xeon Processor E5-2678 v3AMD Radeon R9 290X(1x) and Intel Xeon Processor E5-2620 v2
3DMark Score239801635113534
Graphics Score290581758518872
Physics Score15672132897876
Combined Score15824138556623
Graphics Test 1 fps1146.83182.54180.92
Graphics Test 2 fps2145.84282.38290.27
Graphics Test 3 fps3173.273111.73124.86
Graphics Test 4 fps481.82452.33457.79
Physics Test (fps)49.7642.1925
Combined Test (fps)73.664.4430.81

One glaring thing is that the old AMD (new ones too??) don’t have any PhysX acceleration so the weak processors shine through. And honestly the 980 is still a really solid card. Assuming yours hasn’t been mined to death.

Going from memory here is roughly what I paid (Yes I bought the RTX 2070 before the announcement of the Supers, and basically it’s too high for right now, but this is what happens in tech, value slides way down).

RTX 2070GTX 980R9 290x
price in HKD$3,739$800$350
3DMark Score239801635113534
3DMark per HKD6.4120.4438.67

So the biggest bang for the buck is the used stuff. Like it’s not even close. I should probably add in MSRP’s for the old cards. But here we are.

My old GPU was an GTX 460, before I tried out the 1030 & 1050 before making the leap to the 2070. But lately I’ve been looking for old gen cards as they seem to perform pretty well. Also for AutoCad I picked up a P2000. It’s insane how much those cards can go for, and I should bench that one to show how terrible it is at gaming. But wow what a speed difference in CAD.

Happy 2020!

Or 1992, for those of us living in an offset in the past.

Somehow this year, I’ve managed to do 119 posts, 1 more than last year. Although a far cry from 2011’s 254 posts. I still have things to write about, it’s just a hard time either making them interesting or trying to see them through to some kind of conclusion other than ‘here is a thing’. type thing. thing.

I’m in Sapporo for the new years, and as this was such a last minute plan, I eneded up just grabbing two bottles of this ‘Niagara’ wine… Proudly made in Hokkadio It’s not the worse champaigne Ive ever had, but it’s too… “grapeie” for my likes. But that is what you get for walking in at 1030pm into a 7-11.

I should make a post or video thing about traveling. I run into so many people that always say they are so ‘envious’ of my travels, and how do I plan and cope… And the truth is that I don’t plan, I just show up, I don’t take all that much with me, as we live in the future now, Amazon is a thing, just as everywhere you can order stuff off the internet or buy mass produced goods anywhere. The only real important things are an Android phone for google translate, and data SIMs for wherever you are going. Places like Japan will have them in the airport, and it’s just best to buy them there.

I know that buying things in the airport is usually the worst, or most expensive option, but in many nations like Japan, where non residents are barred from having a phone number or buying any kind of phone or service, your only real option is the airport, although some of the big chain tourist / tax free chain stores (Sofmap/BIC Camera) are starting to carry travel SIMs, although they are more expensive than the offerings in the airport.

Also don’t panic, relax. If you need ‘things’ check out the exciting world of thrift stores. Go use a coin laundromat. Use google translate to read a local news paper, or gossip rag. Why live in a tower of absolute luxury when you can experience a few seconds amongst the people?

And select some cloud service, and setup your phone and camera to backup to the cloud, just embrace it. If you have your stuff lost/stolen you’ll be devastated to have lost everything, instead of only since your last checkpoint. Rent servers out in the internet. If I lose my phone or laptop, all my data and access is ‘out there’ so I just buy a new machine, remote wipe the old, and move on with my life. Heck I don’t even need to travel with such things, and I can just buy discount/burner cheap models on arriving letting me later gift the machines to other people, and or kids. Not to mention like my rescued i7 makes for an interesting tale at least.

For those with boring office jobs, encourage people to use collaboration tools like Sococo and open people’s eyes to that there “is no here” anymore. And that the sad reality of being constantly connected, and constantly available also means you don’t need to be in the office, everywhere you can get internet is your office.

It’s a small world afterall.

Speaking of citations!

Quantum supremacy using a programmable superconducting processor Google AI Quantum and collaboratorsy The tantalizing promise of quantum computers is that certain computational tasks might be executed exponentially faster on a quantum processor than on a classical processor. A fundamental challenge is to build a high- delity processor capable of running quantum algorithms in an exponentially large computational space.

Here, we report using a processor with programmable superconducting qubits to create quantum states on 53 qubits, occupying a state space 253 ˘1016. Measurements from repeated experiments sample the corresponding probability distribution, which we verify using classical simulations. While our processor takes about 200 seconds to sample one instance of the quantum circuit 1 million times, a state-of-the-art supercomputer would require approximately 10,000 years to perform the equivalent task.

This dramatic speedup relative to all known classical algorithms provides an experimental realization of quantum supremacy on a computational task and heralds the advent of a much-anticipated computing paradigm. In the early 1980s, Richard Feynman proposed that a quantum computer would be an effective tool to solve problems in physics and chemistry, as it is exponentially costly to simulate large quantum systems with classical computers [1]. Realizing Feynman’s vision poses significant experimental and theoretical challenges. First, can a quantum system be engineered to perform a computation in a large enough computational (Hilbert) space and with low enough errors to provide a quantum speedup? Second, can we formulate a problem that is hard for a classical computer but easy for a quantum computer? By computing a novel benchmark task on our superconducting qubit processor[2{7], we tackle both questions. Our experiment marks a milestone towards full scale quantum computing: quantum supremacy[8].

In reaching this milestone, we show that quantum speedup is achievable in a real-world system and is not precluded by any hidden physical laws. Quantum supremacy also heralds the era of Noisy Intermediate- Scale Quantum (NISQ) technologies. The benchmark task we demonstrate has an immediate application in generating certifiable random numbers[9]; other initial uses for this new computational capability may include optimization optimization [10{12], machine learning[13{ 15], materials science and chemistry [16{18]. However, realizing the full promise of quantum computing (e.g. Shor’s algorithm for factoring) still requires technical leaps to engineer fault-tolerant logical qubits[19{23].

To achieve quantum supremacy, we made a number of technical advances which also pave the way towards error correction. We developed fast, high delity gates that can be executed simultaneously across a two-dimensional qubit array. We calibrated and bench-marked the processor at both the component and system level using a powerful new tool: cross-entropy bench-marking (XEB).

Finally, we used component-level delities to accurately predict the performance of the whole system, further showing that quantum information behaves as expected when scaling to large systems. A COMPUTATIONAL TASK TO DEMONSTRATE QUANTUM SUPREMACY To demonstrate quantum supremacy, we compare our quantum processor against state-of-the-art classical computers in the task of sampling the output of a pseudo- random quantum circuit[24{26].

Random circuits are a suitable choice for bench-marking since they do not possess structure and therefore allow for limited guarantees of computational hardness[24, 25, 27, 28]. We design the circuits to entangle a set of quantum bits (qubits) by re- peated application of single-qubit and two-qubit logical operations. Sampling the quantum circuit’s output produces a set of bitstrings, e.g. f0000101, 1011100, …g. Due to quantum interference, the probability distribution of the bitstrings resembles a speckled intensity pattern produced by light interference in laser scatter, such that some bitstrings are much more likely to occur than others. Classically computing this probability distribution becomes exponentially more difficult as the number of qubits (width) and number of gate cycles (depth) grows. We verify that the quantum processor is working properly using a method called cross-entropy bench-marking (XEB) [24, 26], which compares how often each bitstring is observed experimentally with its corresponding ideal probability computed via simulation on a classical com- puter. For a given circuit, we collect the measured bit- strings fxigand compute the linear XEB delity [24{ 26, 29], which is the mean of the simulated probabilities of the bitstrings we measured: F XEB = 2 nhP(x i)i i 1 (1) where nis the number of qubits, P(x i) is the probability of bitstring x i computed for the ideal quantum circuit, and the average is over the observed bitstrings. Intuitively, F XEB is correlated with how often we sample high probability bitstrings. When there are no errors in the quantum circuit, sampling the probability distribution will produce F XEB = 1. On the other hand, sampling from the uniform distribution will give hP(x i)i i = 1=2n and produce F XEB = 0. Values of F XEB between 0 and 2 Qubit Adjustable coupler a b 10 millimeters FIG. 1. The Sycamore processor. a, Layout of processor showing a rectangular array of 54 qubits (gray), each connected to its four nearest neighbors with couplers (blue). In- operable qubit is outlined. b, Optical image of the Sycamore chip. 1 correspond to the probability that no error has oc- curred while running the circuit.

The probabilities P(x i) must be obtained from classically simulating the quan- tum circuit, and thus computing F XEB is intractable in the regime of quantum supremacy. However, with certain circuit simplications, we can obtain quantitative delity estimates of a fully operating processor running wide and deep quantum circuits. Our goal is to achieve a high enough F XEB for a circuit with sufficient width and depth such that the classical computing cost is prohibitively large. This is a difficult task because our logic gates are imperfect and the quantum states we intend to create are sensitive to errors. A single bit or phase ip over the course of the algorithm will completely shuffle the speckle pattern and result in close to 0 delity [24, 29].

Therefore, in order to claim quantum supremacy we need a quantum processor that executes the program with sufficiently low error rates. BUILDING AND CHARACTERIZING A HIGH-FIDELITY PROCESSOR We designed a quantum processor named \Sycamore” which consists of a two-dimensional array of 54 trans-mon qubits, where each qubit is tunably coupled to four nearest-neighbors, in a rectangular lattice. The connectivity was chosen to be forward compatible with error- correction using the surface code [20].

A key systems- engineering advance of this device is achieving high- delity single- and two-qubit operations, not just in isolation but also while performing a realistic computation with simultaneous gate operations on many qubits. We discuss the highlights below; extended details can be found in the supplementary information. In a superconducting circuit, conduction electrons con- dense into a macroscopic quantum state, such that cur- rents and voltages behave quantum mechanically [2, 30]. Our processor uses transmon qubits [6], which can be thought of as nonlinear superconducting resonators at 5 to 7 GHz.

The qubit is encoded as the two lowest quantum eigenstates of the resonant circuit. Each transmon has two controls: a microwave drive to excite the qubit, and a magnetic ux control to tune the frequency. Each qubit is connected to a linear resonator used to read out the qubit state [5]. As shown in Fig. 1, each qubit is also connected to its neighboring qubits using a new adjustable coupler [31, 32]. Our coupler design allows us to quickly tune the qubit-qubit coupling from completely o to 40 MHz. Since one qubit did not function properly the device uses 53 qubits and 86 couplers. The processor is fabricated using aluminum for metalization and Josephson junctions, and indium for bump- bonds between two silicon wafers. The chip is wire- bonded to a superconducting circuit board and cooled to below 20 mK in a dilution refrigerator to reduce ambient thermal energy to well below the qubit energy. The processor is connected through filters and attenuators to room-temperature electronics, which synthesize the control signals.

The state of all qubits can be read simultaneously by using a frequency-multiplexing tech- nique[33, 34]. We use two stages of cryogenic amplifiers to boost the signal, which is digitized (8 bits at 1 GS/s) and demultiplexed digitally at room temperature. In total, we orchestrate 277 digital-to-analog converters (14 bits at 1 GS/s) for complete control of the quantum processor. We execute single-qubit gates by driving 25 ns microwave pulses resonant with the qubit frequency while the qubit-qubit coupling is turned off.

The pulses are shaped to minimize transitions to higher transmon states[35]. Gate performance varies strongly with frequency due to two-level-system (TLS) defects[36, 37], stray microwave modes, coupling to control lines and the readout resonator, residual stray coupling between qubits, ux noise, and pulse distortions. We therefore 3 Pauli and measurement errors CDF am, E ted histogr Integra e 1 e 2 e 2c e r a b Average error Single-qubit (e 1 ) Two-qubit (e 2 ) Two-qubit, cycle (e 2c ) Readout (e r ) Isolated 0.15% 0.36% 0.65% 3.1% Simultaneous 0.16% 0.62% 0.93% 3.8% Simultaneous Pauli error e 1 , e 2 10 -2 10 -3 Isolated FIG. 2. System-wide Pauli and measurement errors. a, Integrated histogram (empirical cumulative distribution function, ECDF) of Pauli errors (black, green, blue) and readout errors (orange), measured on qubits in isolation (dotted lines) and when operating all qubits simultaneously (solid). The median of each distribution occurs at 0.50 on the vertical axis. Average (mean) values are shown below. b, Heatmap showing single- and two-qubit Pauli errors e 1 (crosses) and e 2 (bars) positioned in the layout of the processor. Values shown for all qubits operating simultaneously. optimize the single-qubit operation frequencies to miti- gate these error mechanisms. We benchmark single-qubit gate performance by using the XEB protocol described above, reduced to the single- qubit level (n= 1), to measure the probability of an error occurring during a single-qubit gate. On each qubit, we apply a variable number mof randomly selected gates and measure F XEB averaged over many sequences; as m increases, errors accumulate and average F XEB decays.

We model this decay by [1 e 1=(1 1=D2)]m where e 1 is the Pauli error probability. The state (Hilbert) space dimension term, D= 2n = 2, corrects for the depolarizing model where states with errors partially overlap with the ideal state. This procedure is similar to the more typical technique of randomized bench-marking [21, 38, 39], but supports non-Cli ord gatesets [40] and can separate out decoherence error from coherent control error. We then repeat the experiment with all qubits executing single- qubit gates simultaneously (Fig.2), which shows only a small increase in the error probabilities, demonstrating that our device has low microwave crosstalk. We perform two-qubit iSWAP-like entangling gates by bringing neighboring qubits on resonance and turning on a 20 MHz coupling for 12 ns, which allows the qubits to swap excitations. During this time, the qubits also ex- perience a controlled-phase (CZ) interaction, which originates from the higher levels of the transmon. The two- qubit gate frequency trajectories of each pair of qubits are optimized to mitigate the same error mechanisms considered in optimizing single-qubit operation frequencies. To characterize and benchmark the two-qubit gates, we run two-qubit circuits with mcycles, where each cy- cle contains a randomly chosen single-qubit gate on each of the two qubits followed by a xed two-qubit gate. We learn the parameters of the two-qubit unitary (e.g. the amount of iSWAP and CZ interaction) by using F XEB as a cost function. After this optimization, we extract the per-cycle error e 2c from the decay of F XEB with m, and isolate the two-qubit error e 2 by subtracting the two single-qubit errors e 1. We found an average e 2 of 0:36%.

Additionally, we repeat the same procedure while simultaneously running two-qubit circuits for the entire array. After updating the unitary parameters to account for effects such as dispersive shifts and crosstalk, we nd an average e 2 of 0.62%. For the full experiment, we generate quantum circuits using the two-qubit unitaries measured for each pair during simultaneous operation, rather than a standard gate for all pairs. The typical two-qubit gate is a full iSWAP with 1=6 of a full CZ. In principle, our architecture could generate unitaries with arbitrary iSWAP and CZ inter- actions, but reliably generating a target unitary remains an active area of research. Finally, we benchmark qubit readout using standard dispersive measurement [41]. Measurement errors aver- aged over the 0 and 1 states are shown in Fig 2a.

We have also measured the error when operating all qubits simultaneously, by randomly preparing each qubit in the 0 or 1 state and then measuring all qubits for the probability of the correct result. We nd that simultaneous readout incurs only a modest increase in per-qubit measurement errors. Having found the error rates of the individual gates and readout, we can model the delity of a quantum circuit as the product of the probabilities of error-free opera- 4 single-qubit gate: 25 ns qubit XY control two-qubit gate: 12 ns qubit 1 Z control qubit 2 Z control coupler cycle: 1 2 3 4 5 6 m time column row 7 8 A BC D A B D C a b FIG. 3. Control operations for the quantum supremacy circuits. a, Example quantum circuit instance used in our experiment. Every cycle includes a layer each of single- and two-qubit gates. The single-qubit gates are chosen randomly from f p X; p Y; p Wg. The sequence of two-qubit gates are chosen according to a tiling pattern, coupling each qubit sequentially to its four nearest-neighbor qubits.

The couplers are divided into four subsets (ABCD), each of which is executed simultaneously across the entire array corresponding to shaded colors. Here we show an intractable sequence (repeat ABCDCDAB); we also use different coupler subsets along with a simplifiable sequence (repeat EFGHEFGH, not shown) that can be simulated on a classical computer. b, Waveform of control signals for single- and two-qubit gates. tion of all gates and measurements. Our largest random quantum circuits have 53 qubits, 1113 single-qubit gates, 430 two-qubit gates, and a measurement on each qubit, for which we predict a total delity of 0:2%.

This delity should be resolvable with a few million measurements, since the uncertainty on F XEB is 1= p N s, where N s is the number of samples. Our model assumes that entangling larger and larger systems does not introduce additional error sources beyond the errors we measure at the single- and two-qubit level | in the next section we will see how well this hypothesis holds. FIDELITY ESTIMATION IN THE SUPREMACY REGIME The gate sequence for our pseudo-random quantum circuit generation is shown in Fig.3. One cycle of the algorithm consists of applying single-qubit gates chosen randomly from f p X; p Y; p Wgon all qubits, followed by two-qubit gates on pairs of qubits. The sequences of gates which form the \supremacy circuits” are designed to minimize the circuit depth required to create a highly entangled state, which ensures computational complexity and classical hardness. While we cannot compute F XEB in the supremacy regime, we can estimate it using three variations to reduce the complexity of the circuits.

In \patch circuits”, we remove a slice of two-qubit gates (a small fraction of the total number of two-qubit gates), splitting the circuit into two spatially isolated, non-interacting patches of qubits. We then compute the total delity as the product of the patch delities, each of which can be easily calculated. In \elided circuits”, we remove only a fraction of the initial two-qubit gates along the slice, allowing for entanglement between patches, which more closely mimics the full experiment while still maintaining simulation feasibility. Finally, we can also run full \veri cation cir- cuits” with the same gate counts as our supremacy circuits, but with a different pattern for the sequence of two- qubit gates which is much easier to simulate classically [29]. Comparison between these variations allows tracking of the system delity as we approach the supremacy regime. We rst check that the patch and elided versions of the verification circuits produce the same delity as the full verification circuits up to 53 qubits, as shown in Fig.4a. For each data point, we typically collect N s = 5 106 total samples over ten circuit instances, where instances differ only in the choices of single-qubit gates in each cycle.

We also show predicted F XEB values computed by multiplying the no-error probabilities of single- and two-qubit gates and measurement [29]. Patch, elided, and predicted delities all show good agreement with the delities of the corresponding full circuits, despite the vast differences in computational complexity and en- tanglement. This gives us confidence that elided circuits can be used to accurately estimate the delity of more complex circuits. We proceed now to benchmark our most computationally difficult circuits. In Fig.4b, we show the measured F XEB for 53-qubit patch and elided versions of the full supremacy circuits with increasing depth.

For the largest circuit with 53 qubits and 20 cycles, we collected N s = 30 106 samples over 10 circuit instances, obtaining F XEB = (2:24 0:21) 10 3 for the elided circuits. With 5˙con dence, we assert that the average delity of running these circuits on the quantum processor is greater than at least 0.1%. The full data for Fig.4b should have similar delities, but are only archived since the simulation times (red numbers) take too long. It is thus in the quantum supremacy regime. 5 number of qubits, n number of cycles, m n = 53 qubits a Classically verifiable b Supremacy regime idelity, XEB F

XEB m = 14 cycles Prediction from gate and measurement errors Full circuit Elided circuit Patch circuit Prediction Patch E F G H A B C D C D A B Elided (±5 error bars) 10 millennia 100 years 600 years 4 years 4 years 2 weeks 1 week 2 hour sC la ic mp ng @ Sycamore 5 hours Classical verification Sycamore sampling (N s = 1M): 200 seconds 10 15 20 25 30 35 40 45 50 55 12 14 16 18 20 10 -3 10 -2 10 -1 10 0 FIG. 4. Demonstrating quantum supremacy. a, Verification of bench-marking methods. F XEB values for patch, elided, and full verification circuits are calculated from measured bitstrings and the corresponding probabilities predicted by classical simulation. Here, the two-qubit gates are applied in a simplifiable tiling and sequence such that the full circuits can be simulated out to n= 53;m= 14 in a reasonable amount of time. Each data point is an average over 10 distinct quantum circuit instances that differ in their single-qubit gates (for n= 39;42;43 only 2 instances were simulated). For each n, each instance is sampled with N s between 0:5M and 2:5M.

The black line shows predicted F XEB based on single- and two-qubit gate and measurement errors. The close correspondence between all four curves, despite their vast differences in complexity, justifies the use of elided circuits to estimate delity in the supremacy regime. b, Estimating F XEB in the quantum supremacy regime. Here, the two-qubit gates are applied in a non-simplifiable tiling and sequence for which it is much harder to simulate. For the largest elided data (n= 53, m= 20, total N s = 30M), we nd an average F XEB >0.1% with 5˙confidence, where ˙includes both systematic and statistical uncertainties.

The corresponding full circuit data, not simulated but archived, is expected to show similarly significant delity. For m= 20, obtaining 1M samples on the quantum processor takes 200 seconds, while an equal delity classical sampling would take 10,000 years on 1M cores, and verifying the delity would take millions of years. DETERMINING THE CLASSICAL COMPUTATIONAL COST We simulate the quantum circuits used in the experiment on classical computers for two purposes: verifying our quantum processor and bench-marking methods by computing F XEB where possible using simplifiable circuits (Fig.4a), and estimating F XEB as well as the classical cost of sampling our hardest circuits (Fig.4b).

Up to 43 qubits, we use a Schrodinger algorithm (SA) which simulates the evolution of the full quantum state; the Julich supercomputer(100k cores, 250TB) runs the largest cases. Above this size, there is not enough RAM to store the quantum state [42]. For larger qubit numbers, we use a hybrid Schrodinger-Feynman algorithm (SFA)[43] running on Google data centers to compute the amplitudes of individual bitstrings. This algorithm breaks the circuit up into two patches of qubits and efficiently simulates each patch using a Schrodinger method, before connecting them using an approach reminiscent of the Feynman path-integral.

While it is more memory- ecient, SFA becomes exponentially more computation- ally expensive with increasing circuit depth due to the exponential growth of paths with the number of gates connecting the patches. To estimate the classical computational cost of the supremacy circuits (gray numbers, Fig.4b), we ran portions of the quantum circuit simulation on both the Sum- mit supercomputer as well as on Google clusters and extrapolated to the full cost. In this extrapolation, we account for the computational cost scaling with F XEB, e.g. the 0.1% delity decreases the cost by 1000[43, 44]. On the Summit supercomputer, which is currently the most powerful in the world, we used a method inspired by Feynman path-integrals that is most efficient at low depth[44{47].

At m= 20 the tensors do not reasonably t in node memory, so we can only measure runtimes up to m= 14, for which we estimate that sampling 3M bitstrings with 1% delity would require 1 year. 6 On Google Cloud servers, we estimate that perform- ing the same task for m= 20 with 0:1% delity using the SFA algorithm would cost 50 trillion core-hours and consume 1 petawatt hour of energy. To put this in per- spective, it took 600 seconds to sample the circuit on the quantum processor 3 million times, where sampling time is limited by control hardware communications; in fact, the net quantum processor time is only about 30 seconds. The bitstring samples from this largest circuit are archived online. One may wonder to what extent algorithmic innovation can enhance classical simulations. Our assumption, based on insights from complexity theory, is that the cost of this algorithmic task is exponential in nas well as m. Indeed, simulation methods have improved steadily over the past few years[42{50].

We expect that lower simulation costs than reported here will eventually be achieved, but we also expect they will be consistently outpaced by hardware improvements on larger quantum processors. VERIFYING THE DIGITAL ERROR MODEL A key assumption underlying the theory of quantum error correction is that quantum state errors may be considered digitized and localized [38, 51]. Under such a dig- ital model, all errors in the evolving quantum state may be characterized by a set of localized Pauli errors (bit and/or phase ips) interspersed into the circuit. Since continuous amplitudes are fundamental to quantum mechanics, it needs to be tested whether errors in a quantum system could be treated as discrete and probabilistic. In- deed, our experimental observations support the validity of this model for our processor. Our system delity is well predicted by a simple model in which the individually characterized delities of each gate are multiplied together (Fig

4). To be successfully described by a digitized error model, a system should be low in correlated errors. We achieve this in our experiment by choosing circuits that randomize and decorrelate errors, by optimizing control to minimize systematic errors and leakage, and by design- ing gates that operate much faster than correlated noise sources, such as 1=f ux noise [37]. Demonstrating a pre- dictive uncorrelated error model up to a Hilbert space of size 253 shows that we can build a system where quantum resources, such as entanglement, are not prohibitively fragile. WHAT DOES THE FUTURE HOLD?

Quantum processors based on superconducting qubits can now perform computations in a Hilbert space of dimension 253 ˇ9 1015, beyond the reach of the fastest classical supercomputers available today. To our knowledge, this experiment marks the rst computation that can only be performed on a quantum processor. Quantum processors have thus reached the regime of quantum supremacy. We expect their computational power will continue to grow at a double exponential rate: the classical cost of simulating a quantum circuit increases exponentially with computational volume, and hardware improvements will likely follow a quantum-processor equivalent of Moore’s law [52, 53], doubling this computational volume every few years. To sustain the double exponential growth rate and to eventually o er the computational volume needed to run well-known quantum algorithms, such as the Shor or Grover algorithms [19, 54], the engineering of quantum error correction will have to become a focus of attention. The \Extended Church-Turing Thesis” formulated by Bernstein and Vazirani [55] asserts that any \reasonable” model of computation can be efficiently simulated by a Turing machine. Our experiment suggests that a model of computation may now be available that violates this assertion.

We have performed random quantum circuit sampling in polynomial time with a physically realized quantum processor (with sufficiently low error rates), yet no efficient method is known to exist for classical computing machinery. As a result of these developments, quantum computing is transitioning from a research topic to a technology that unlocks new computational capabilities. We are only one creative algorithm away from valuable near-term applications. Acknowledgments We are grateful to Eric Schmidt, Sergey Brin, Je Dean, and Jay Yagnik for their executive sponsorship of the Google AI Quantum team, and for their continued engagement and support. We thank Peter Norvig for reviewing a draft of the manuscript, and Sergey Knysh for useful discussions.

We thank Kevin Kissel, Joey Raso, Davinci Yonge-Mallo, Orion Martin, and Niranjan Sridhar for their help with simulations. We thank Gina Bortoli and Lily Laws for keeping our team organized. This research used resources from the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

A portion of this work was performed in the UCSB Nanofabrication Facility, an open access laboratory. Author contributions The Google AI Quantum team conceived of the experiment. The applications and algorithms team provided the theoretical foundation and the specifics of the algorithm. The hardware team carried out the experiment and collected the data. The data analysis was done jointly with outside collaborators.


[1]Feynman, R. P. Simulating physics with computers. Int. J. Theor. Phys. 21, 467{488 (1982).

[2]Devoret, M. H., Martinis, J. M. & Clarke, J. Mea- surements of macroscopic quantum tunneling out of the zero-voltage state of a current-biased josephson junction. Phys. Rev. Lett 55, 1908 (1985).

[3]Nakamura, Y., Chen, C. D. & Tsai, J. S. Spectroscopy of energy-level splitting between two macroscopic quan- tum states of charge coherently superposed by josephson coupling. Phys. Rev. Lett. 79, 2328 (1997).

[4]Mooij, J. et al. Josephson persistent-current qubit. Sci- ence 285, 1036 (1999).

[5]Wallra , A. et al. Strong coupling of a single photon to a superconducting qubit using circuit quantum electrody- namics. Nature 431, 162 (2004).

[6]Koch, J. et al. Charge-insensitive qubit design derived from the cooper pair box. Phys. Rev. A 76, 042319 (2007).

[7]You, J. Q. & Nori, F. Atomic physics and quantum optics using superconducting circuits. Nature 474, 589 (2011).

[8]Preskill, J. Quantum computing and the entanglement frontier. Rapporteur talk at the 25th Solvay Conference on Physics, Brussels (2012).

[9]Aaronson, S. Certi ed randomness from quantum supremacy. In preparation .

[10]Hastings, M. B. Classical and Quantum Bounded Depth Approximation Algorithms. arXiv e-prints arXiv:1905.07047 (2019). 1905.07047.

[11]Kechedzhi, K. et al. Ecient population transfer via non- ergodic extended states in quantum spin glass. arXiv e-prints arXiv:1807.04792 (2018). 1807.04792.

[12]Somma, R. D., Boixo, S., Barnum, H. & Knill, E. Quan- tum simulations of classical annealing processes. Phys. Rev. Lett. letters 101, 130504 (2008).

[13]McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush, R. & Neven, H. Barren plateaus in quantum neural net- work training landscapes. Nat. Comm. 9, 4812 (2018).

[14]Cong, I., Choi, S. & Lukin, M. D. Quantum convolutional neural networks. arXiv:1810.03787 (2018).

[15]Bravyi, S., Gosset, D. & Konig, R. Quantum advantage with shallow circuits. Science 362, 308{311 (2018).

[16]Aspuru-Guzik, A., Dutoi, A. D., Love, P. J. & Head- Gordon, M. Simulated quantum computation of molecu- lar energies. Science 309, 1704{1707 (2005).

[17]Peruzzo, A. et al. A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 4213 (2014). [18]Hempel, C. et al. Quantum chemistry calculations on a trapped-ion quantum simulator. Phys. Rev. X 8, 031022 (2018).

[19]Shor, P. W. Algorithms for quantum computation: dis- crete logarithms and factoring proceedings. Proceedings 35th Annual Symposium on Foundations of Computer Science (1994).

[20]Fowler, A. G., Mariantoni, M., Martinis, J. M. & Cle- land, A. N. Surface codes: Towards practical large-scale quantum computation. Phys. Rev. A 86, 032324 (2012).

[21]Barends, R. et al. Superconducting quantum circuits at the surface code threshold for fault tolerance. Nature 508, 500{503 (2014).

[22]Corcoles, A. D. et al. Demonstration of a quantum error detection code using a square lattice of four supercon- ducting qubits. Nat. Commun. 6, 6979 (2015).

[23]Ofek, N. et al. Extending the lifetime of a quantum bit with error correction in superconducting circuits. Nature 536, 441 (2016).

[24]Boixo, S. et al. Characterizing quantum supremacy in near-term devices. Nat. Phys. 14, 595 (2018).

[25]Aaronson, S. & Chen, L. Complexity-theoretic founda- tions of quantum supremacy experiments. In 32nd Com- putational Complexity Conference (CCC 2017) (2017).

[26]Neill, C. et al. A blueprint for demonstrating quantum supremacy with superconducting qubits. Science 360, 195{199 (2018).

[27]Bremner, M. J., Montanaro, A. & Shepherd, D. J. Average-case complexity versus approximate simulation of commuting quantum computations. Phys. Rev. Lett. 117, 080501 (2016).

[28]Bouland, A., Fe erman, B., Nirkhe, C. & Vazi- rani, U. Quantum supremacy and the com- plexity of random circuit sampling. Preprint at (2018).

[29]See supplementary information .

[30]Vool, U. & Devoret, M. Introduction to quantum electro- magnetic circuits. Int. J. Circ. Theor. Appl. 45, 897{934 (2017). 8

[31]Chen, Y. et al. Qubit architecture with high coherence and fast tunable coupling circuits. Phys. Rev. Lett. 113, 220502 (2014).

[32]Yan, F. et al. A tunable coupling scheme for implement- ing high- delity two-qubit gates. Phys. Rev. Applied 10, 054062 (2018).

[33]Schuster, D. I. et al. Resolving photon number states in a superconducting circuit. Nature 445, 515 (2007).

[34]Je rey, E. et al. Fast accurate state measurement with superconducting qubits. Phys. Rev. Lett. 112, 190504 (2014).

[35]Chen, Z. et al. Measuring and suppressing quantum state leakage in a superconducting qubit. Phys. Rev. Lett. 116, 020501 (2016).

[36]Klimov, P. V. et al. Fluctuations of energy-relaxation times in superconducting qubits. Phys. Rev. Lett. 121, 090502 (2018).

[37]Yan, F. et al. The ux qubit revisited to enhance coher- ence and reproducibility. Nat. Commun. 7, 12964 (2016).

[38]Knill, E. et al. Randomized benchmarking of quantum gates. Phys. Rev. A 77, 012307 (2008).

[39]Magesan, E., Gambetta, J. M. & Emerson, J. Scalable and robust randomized benchmarking of quantum pro- cesses. Phys. Rev. Lett. 106, 180504 (2011).

[40]Cross, A. W., Magesan, E., Bishop, L. S., Smolin, J. A. & Gambetta, J. M. Scalable randomised benchmarking of non-cli ord gates. NPJ Quantum Information 2, 16012 (2016).

[41]Wallra , A. et al. Approaching unit visibility for control of a superconducting qubit with dispersive readout. Phys. Rev. Lett. 95, 060501 (2005).

[42]De Raedt, H. et al. Massively parallel quantum computer simulator, eleven years later. Comput. Phys. Commun. 237, 47 { 61 (2019).

[43]Markov, I. L., Fatima, A., Isakov, S. V. & Boixo, S. Quantum supremacy is both closer and farther than it appears. Preprint at (2018).

[44]Villalonga, B. et al. A exible high-performance sim- ulator for the veri cation and benchmarking of quan- tum circuits implemented on real hardware. Preprint at (2018).

[45]Boixo, S., Isakov, S. V., Smelyanskiy, V. N. & Neven, H. Simulation of low-depth quantum circuits as complex undirected graphical models. Preprint at (2017).

[46]Chen, J., Zhang, F., Huang, C., Newman, M. & Shi, Y. Classical simulation of intermediate-size quantum circuits. Preprint at (2018).

[47]Villalonga, B. et al. Establishing the quantum supremacy frontier with a 281 p op/s simulation. Preprint at (2019).

[48]Pednault, E. et al. Breaking the 49-qubit barrier in the simulation of quantum circuits. Preprint at (2017).

[49]Chen, Z. Y. et al. 64-qubit quantum circuit simulation. Sci. Bull. 63, 964{971 (2018).

[50]Chen, M.-C. et al. Quantum teleportation-inspired al- gorithm for sampling large random quantum circuits. Preprint at (2019).

[51]Shor, P. W. Scheme for reducing decoherence in quan- tum computer memory. Phys. Rev. A 52, R2493{R2496 (1995).

[52]Devoret, M. H. & Schoelkopf, R. J. Superconducting circuits for quantum information: An outlook. Science 339, 1169{1174 (2013).

[53]Mohseni, M. et al. Commercialize quantum technologies in ve years. Nature 543, 171 (2017).

[54]Grover, L. K. Quantum mechanics helps in searching for a needle in a haystack. letters 79, 325 (1997).

[55]Bernstein, E. & Vazirani, U. Quantum complexity the- ory. Proc. 25th Annual ACM Symposium on Theory of Computing (1993).

Sometimes translation software provides unique insights

When asking youth about the PLA blowing up stuff…

In China the tradition of using hard subs so that no subversive messages can be later inserted sometimes leads to some ‘funny’ situations.

As you may have heard there has been numerous protests here again over the extradition bill, along with the lack of universal suffrage, to outright collusion with the police & the triads. Now the PLA is getting in on the messaging, by doing a promo video featuring the Hong Kong garrison reminding us that they have all various calibers of machine guns, armored vehicles, boats, helicopters and rockets to subdue unarmed civilians.

At the 2:12 mark these kids are saying ‘So fierce’ and of course the translation slips through what the video is really about.

With all the ridiculousness of the past month, I really cant see the government taking us to the point of Martial Law. But there is always that possibility the last month has been anything but typical.

Moving offices again

Things are going well, and I’ve outgrown the old place. So time to move.

I’m super lucky, there is no denying that. So to push my luck I’m giving myself the corner office.

Unfortunately the prior tenant believes that masking tape X’s in the windows makes them stronger and will prevent them from breaking during a typhoon. I cannot believe how many people try to tell me that paper tape is somehow going to catch shards of glass being propelled upwards of 200km/hr.

It’s all exciting to me as the success is not only with my company, but its not an IT company either. It’s such an interesting thing being thrown into a different field although many of the challenges oddly enough remain the same.

Anyway all the hosted stuff is obviously offline. I think I’m getting a different public address, to further complicate things.

So yes, I’ve been busy