The 65th episode of Datacast is my conversation with David Sweet, an experienced quantitative trader and machine learning engineer who has used experimental methods to tune large-scale trading and recommendation systems.
Our wide-ranging conversation touches on his educational background in Physics and Ph.D. work on chaos theory; his work on open-source software and open content in the early 2000s; his Wall Street quant career dabbling across hedge-fund management, investment banking, and cryptocurrency trading; his work on recommendation systems at Instagram; his book “Tuning Up” that explores experimental optimization methods, and much more.
Please enjoy my conversation with David!
Listen to the show on (1) Spotify, (2) Apple Podcasts, (3) Google Podcasts, (4) TuneIn, (5) RadioPublic, (6) Stitcher, and (7) Breaker.
Key Takeaways
Here are highlights from my conversation with David:
On Studying at Duke University
I arrived at Duke in the year they won the NCAA championship. Duke has a lot of school spirit, so it was insanely jubilant. The way Duke people celebrated was to take all the benches off the main quad and throw them into a bonfire. In general, Duke has a strong on-campus social life. I also got to learn about North Carolina barbecue. In the Northeast, where I’m from, barbecue is the event. In North Carolina, barbecue is the food: pulled pork and hash potatoes.
I got a BS in Physics and a BA in Math. A nice thing about Duke was the opportunity to do things beyond classes. For example, I did an independent study with Richard Palmer on neural networks and reinforcement learning, which were fairly new back in the 90s.
My favorite undergraduate Physics course was a seminar on General Relativity. It’s a fascinating topic and hard to grasp. However, in a seminar format, we had many group discussions, which provided a nice combination of intellectual and social interactions.
On Getting a Ph.D. in Physics at University of Maryland, College Park
For me, the question at the time would have been: “why not pursue a doctorate?” I enjoyed Physics, and there was a lot more to learn. It was as natural as going from freshman to sophomore year.
Chaos theory was a popular topic in the 80s. In popular media, you would have seen fractals, the images that are associated with chaotic systems. For example, chaos theory was featured in Jurassic Park. The defining characteristic of chaotic systems is sensitive dependence on initial conditions. Two initial conditions from which you start the system might look the same to you, but there will be slight discrepancies in any realistic setup. Those discrepancies grow explosively, so in a finite amount of time, you’ll observe the differences no matter how infinitesimal they were when you started. But the behavior of the system itself is bounded.
In physics, we talk about a state space, where we describe physical systems by variables such as position and momentum. By the late 90s, lots of work had been done on systems with few such variables because the computers back then weren’t capable of running the simulations needed to study systems with more degrees of freedom. As computers got faster, I studied higher-dimensional analogues of lower-dimensional results.
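As a rough illustration of sensitive dependence (not the system David studied), the sketch below iterates the logistic map, a standard textbook example of chaos. The parameter `r = 4.0`, the starting value `0.2`, and the `1e-10` offset are all assumptions chosen for the demonstration.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that look identical to any realistic measurement.
orbit_a = logistic_orbit(0.2)
orbit_b = logistic_orbit(0.2 + 1e-10)

# Distance between the two orbits at every step.
gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The printed gap starts tiny and grows until it is comparable to the size of the interval itself, while every value in both orbits stays between 0 and 1. That is the combination described above: explosive growth of small discrepancies inside a bounded system.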
On His Nature Paper about “Topology in Chaotic Scattering”
This paper includes a geometrical optics experiment, meaning that it’s a study of light rays bouncing off reflective surfaces.
Imagine that you have a 12-inch diameter sphere mirrored on the outside. You set it down on the table in front of you. Then you get two more spheres and place them next to the first sphere, such that all three are mutually touching (each sphere touches the other two). Finally, you stack a fourth sphere on top, like a pile of cannonballs. The centers of the spheres sit at the vertices of a tetrahedron (a pyramid).
We then took this apparatus into a research lab. On three of the four faces of that pyramid (where a face is a triangle connecting the centers of three spheres), we placed colored poster boards and shone light on them. That light reflected off the spheres and entered the inner chamber formed by the four spheres.
Looking inside, we saw a fractal with colored patches of red, white, blue, and black scattered around. The boundaries between the colors are fractal and are called Wada boundaries. This was the first physical realization of this kind of fractal, which had arisen in computer simulations of similar-looking systems.
Another outcome of this experiment was a pretty physics picture. When the paper was published in Nature, its figure was made the cover of that issue. As a result, more people became aware of our work. It has been used to demonstrate chaos and fractals in undergraduate labs and became part of the Tetrasphere exhibit at the National Museum of Mathematics in NYC.
On The K Desktop Environment
The K Desktop Environment is an open-source GUI desktop for Linux, including the menus, the windows, the session management, the utilities, etc. As the KDE project moved toward version 2.0, many developers came to the mailing list and showed interest in participating, but they needed to get set up. So I decided to put together a how-to manual that took you from nothing to contributing to the project. It was very well-received.
Because of that, I got contacted by someone who was writing a book about KDE and needed a chapter on the KDE code. He asked me to write that chapter, essentially expanding my how-to manual. I did that to get my feet wet in writing. Then I thought maybe there’s more to that. Maybe I could write a whole book about KDE development. I pitched that idea to MacMillan Publishing, and they provided the support, resulting in a published book. The first half was on version 1.0, while the second half was on version 2.0. The book was definitely a group effort of many KDE developers.
On Open Content via Andamooka
In the early 2000s, open source had become quite popular. The specific problem was this: if you wrote documentation for some open-source software and that documentation was closed-source, the documentation couldn’t evolve along with the software. So people were discussing open licenses for documentation. One of these licenses is the Open Publication License (OPL). I asked MacMillan whether the book could be released under this license as an experiment. They were on board. We printed the book and sold it while also making the PDF version freely available for anyone to download, modify, and redistribute with attribution.
The PDF was hosted on Andamooka, built on the Flash content management system. I then found other authors who had published OPL books and invited them to put their books on the site. One of the authors was Jimmy Wales, who was working on something called Nupedia (the precursor to Wikipedia).
On Becoming a Quant Analyst
I went into the industry because I was interested in building things. I love the novelty in research and found that satisfying. But building automated systems in the industry is even more fascinating to me. A trading strategy is a kind of autonomous vehicle, one that you build not with your hands but with software.
So what happened when I entered the industry after getting a Ph.D.? It was an amazing shift. All my learnings had been in an incremental progression up until that point — each new year was built upon the previous year.
Going into the industry was like going back to square one: I had no finance, engineering, or economics expertise. It wasn’t because I had skipped steps in a steady progression or fallen backward. It’s just that I got kicked out into left field, in a sense.
I had underestimated how dramatic that change would be. It was tough but exciting at the same time. There was so much more to study, and I needed to learn it on the fly, either on my own or with my peers, not in a controlled environment like a classroom.
As a scientist, I had learned to build free-parameter models, which I could analyze by hand or by simulation. But for an online trading strategy, I needed to tune parameters to the real world. There were good answers, and I needed to find them, which led me to learn about experimental methods.
On Working at Lehman Brothers
Lehman Brothers was a huge bank. 30,000+ people worked there during my time in the mid-2000s. So on the trading floor of the building in Times Square where I worked, about 500 people were trading — lots of yelling on the phone, lots of what to buy and what to sell, etc.
For me, as a quant sitting in front of a computer, it felt like doing my math homework at a football game — not the most productive environment as you can imagine. So that was tough to get used to.
I worked on systems where positions were held intra-day. In my last year at Lehman, I worked with a small group of four to six people building a high-frequency trading system. There, the holding time was even shorter, maybe a minute or so. Then, I discovered companies like GETCO and TradeBot who were already doing these things successfully. While I loved the problem, I was not too fond of the idea of reinventing the solution. I knew the problem had been solved to a large extent elsewhere. That precipitated my transition to GETCO.
On Contextual Bandits at Teza Technologies
At Teza, I worked on execution algorithms for high-frequency timescales. At the time, we were short-staffed for this project and were interviewing people to work on A/B testing. At such short timescales, we need to understand what impact a cause might have on the execution.
While interviewing people, I was also looking at ways to get this thing up and running in an automated and efficient manner. I ended up learning about contextual bandits, which have three parts: (1) modeling of the cause / continuously refitting, (2) exploring different kinds of trade under new conditions, and (3) de-biasing the model from certain policies. Contextual bandits ran fairly autonomously, so I could work on improving other pieces of our execution ability.
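To make the three parts concrete, here is a minimal sketch of an epsilon-greedy contextual bandit. It is not Teza's system: the contexts (`calm`, `volatile`), the actions (`passive`, `aggressive`), the reward numbers in `TRUE_MEAN`, and every function name are hypothetical. The running average stands in for the continuously refit model, the random draw handles exploration, and the logged propensities allow an inverse-propensity estimate that corrects for the policy that collected the data.

```python
import random
from collections import defaultdict

# Hypothetical mean reward for each (market context, order style) pair.
TRUE_MEAN = {
    ("calm", "passive"): 1.0,
    ("calm", "aggressive"): 0.2,
    ("volatile", "passive"): 0.1,
    ("volatile", "aggressive"): 0.8,
}
ACTIONS = ["passive", "aggressive"]


class EpsilonGreedyBandit:
    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.total = defaultdict(float)  # summed reward per (context, action)
        self.count = defaultdict(int)    # observations per (context, action)

    def estimate(self, context, action):
        # Part 1: the "model" is a running mean, refit after every update.
        n = self.count[(context, action)]
        return self.total[(context, action)] / n if n else 0.0

    def greedy(self, context):
        return max(self.actions, key=lambda a: self.estimate(context, a))

    def choose(self, context):
        # Part 2: explore a random action with probability epsilon.
        best = self.greedy(context)
        explore = self.rng.random() < self.epsilon
        action = self.rng.choice(self.actions) if explore else best
        # Probability that this policy picks `action` in this context.
        propensity = self.epsilon / len(self.actions)
        if action == best:
            propensity += 1.0 - self.epsilon
        return action, propensity

    def update(self, context, action, reward):
        self.total[(context, action)] += reward
        self.count[(context, action)] += 1


def run_simulation(rounds=5000, seed=1):
    env = random.Random(seed)
    bandit = EpsilonGreedyBandit(ACTIONS, epsilon=0.1, seed=seed + 1)
    log = []
    for _ in range(rounds):
        context = env.choice(["calm", "volatile"])
        action, propensity = bandit.choose(context)
        reward = env.gauss(TRUE_MEAN[(context, action)], 0.3)
        bandit.update(context, action, reward)
        log.append((context, action, reward, propensity))
    return bandit, log


def ips_value(log, policy):
    # Part 3: inverse-propensity estimate of the average reward that `policy`
    # (a dict mapping context to action) would have earned on the logged data.
    total = sum(r / p for c, a, r, p in log if policy[c] == a)
    return total / len(log)
```

Calling `run_simulation()` returns the trained bandit and its log, and `ips_value(log, {"calm": "passive", "volatile": "aggressive"})` then estimates how that fixed policy would have performed, even though the log was collected by a different, changing policy. A production system would replace the lookup table with a regression model over continuous context features.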
On Cryptocurrency Trading
Unfortunately, Teza got shut down. While looking for something new, I connected with two traders/entrepreneurs running a trading shop. One of the investors was Mike Novogratz, who co-founded one of the first public hedge funds, Fortress. He had taken an interest in cryptocurrency and was looking to fund people who traded it. I had been interested in crypto and traded it by hand for a while. However, nobody in finance really took it seriously. So I was very surprised to hear that somebody in traditional trading was looking to fund cryptocurrency trading.
Eventually, the three of us formed a company called Belverade Trading and started trading live in November 2017 (just when cryptocurrency started getting crazy). Those were a few months of sleepless nights — building, repairing, and improving our system as it was trading. It was a wild ride and one of the best experiences of my professional life.
Mike Novogratz also funded other cryptocurrency ventures at the time. Each of these looked like a department within an investment bank. He decided to combine all of them into a single company. As a result, Belverade Trading was re-branded Galaxy Digital Trading. One of my partners left early to run a former business. I stayed for about 6 months to bring the team and software that I had built to a mature and stable level.
Beyond the experience of being an entrepreneur and creating something of my own, doing it in the crypto community at that time was awesome. There were a bunch of really passionate and optimistic people. Not a day went by when I was not inspired.
On Optimization of HFT Systems
I gave these lectures once a year (for a few years) at Vasant Dhar’s “Trading Strategy Systems” class at NYU Stern School of Business. The idea was to teach experimental optimization methods used in quantitative trading. I wanted to talk about the pros and cons of simulations and experiments and how they could work well together.
The nice thing about taking experimental measurements is that you are looking at a real system in the real world, where your data is unbiased. The drawback is that data collection is slow, expensive, and risky.
Simulation, on the other hand, is very precise. You can run over a lot of historical data with different configurations. It is cheap, and real money is not at risk. However, simulation is a model, and any model is going to have some level of bias.
They do work well together: you take precision from simulations and accuracy from experiments.
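One way to picture that division of labor is the sketch below. It is an illustration under assumed numbers, not a method from the lectures: the profit curves, the optima `0.6` and `0.5`, and the noise level are all made up. A noise-free but biased simulator cheaply ranks every candidate parameter value, and a small number of noisy but unbiased live measurements then picks the winner from the simulator's shortlist.

```python
import random

TRUE_OPTIMUM = 0.6  # hypothetical best parameter value in the real world
SIM_OPTIMUM = 0.5   # where the (biased) simulator thinks the best value is


def live_profit(x, rng):
    """One noisy but unbiased measurement from the real system."""
    return -100.0 * (x - TRUE_OPTIMUM) ** 2 + rng.gauss(0.0, 1.0)


def simulated_profit(x):
    """A noise-free (precise) but biased model of the same system."""
    return -100.0 * (x - SIM_OPTIMUM) ** 2


def tune(candidates, shortlist_size=21, trials_per_candidate=50, seed=0):
    rng = random.Random(seed)

    # Step 1: simulation is cheap, so rank every candidate with it.
    ranked = sorted(candidates, key=simulated_profit, reverse=True)
    sim_choice = ranked[0]
    shortlist = ranked[:shortlist_size]

    # Step 2: experiments are expensive, so run them on the shortlist only.
    def mean_live(x):
        samples = [live_profit(x, rng) for _ in range(trials_per_candidate)]
        return sum(samples) / len(samples)

    measured = {x: mean_live(x) for x in shortlist}
    final_choice = max(measured, key=measured.get)
    return sim_choice, final_choice


candidates = [i / 100 for i in range(101)]
sim_choice, final_choice = tune(candidates)
print("simulation alone picks:", sim_choice)
print("simulation + experiment picks:", final_choice)
```

The simulator on its own settles on its biased optimum, while the experiments pull the final choice toward the true one. The shortlist has to be wide enough to cover the simulator's bias, which is itself a judgment call in practice.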
Most of the lectures also discussed different methods used for experimental optimization (A/B testing, multi-armed bandits, response surface methodology, contextual bandits, Bayesian optimization, etc.) and showed how they build on top of each other.
On Working as a Machine Learning Engineer at Instagram
I observed a growing gap between the ML techniques used in finance and those used at technology companies such as Google, Apple, Facebook, and Uber. As these technology companies grew, they were able to open basic research labs within their companies. These labs have produced cutting-edge algorithms/ideas and pushed them to the applied side. I had not seen finance companies having that level of innovation. Was that because these techniques are not applicable?
I was learning by reading papers and running little side projects on my own. But if you have ever taken a math class, you know how it goes: you read the chapters and think you understand, but when you do the homework problems, you find that you didn’t really understand at all. There’s something about hands-on practice that increases your understanding of whatever the topic might be.
So what I wanted was to go to one of these tech companies and use the ML techniques (that I was learning about) in practice. Hopefully, it could build confidence in my ability to predict whether these techniques would be useful in any given context.
At Instagram, I worked on Feed and Stories. If you use the Instagram app, the stream of pictures is the Feed, and the circles at the top are the temporary Stories. My team worked on the recommender system, which decides which images/stories to show to users and in what order. A couple of nice things happened:
I got the opportunity to work with recurrent neural networks, counterfactual policy estimation, and Bayesian optimization.
Furthermore, at such a large company with so many experts in different algorithmic topics, I could sometimes ask deep questions of a person who wrote a paper that I had read. I could hit them up on Facebook’s internal communication tool, and they were happy to respond.
On His New Book “Tuning Up”
A big part of the research for the book has been interviewing colleagues and friends of friends from finance and tech companies.
What I found out is that people sometimes don’t want to talk at all. Or they talk but don’t want to be named. This is because they are afraid of accidentally saying something proprietary.
Given my background in open-source software and the open publication that I used to run, I always thought that openness is a good thing, at least in scientific endeavors. You share information, everybody can build on everything else, and there’s less wasted time reinventing the wheel.
I wonder if the industry has the right level of transparency. When people are afraid to talk, I wonder if the cost is more than just some inefficiency due to the lack of transparency. Maybe there’s a toll taken on people if they can’t talk.
The pandemic came out of the blue a few months into writing the book.
Things have stabilized, but for a while, everything was changing. It was hard to sit down and focus on writing the book.
But otherwise, Manning has been super supportive in both professional and emotional ways. Likewise, my editor is relentless in the kindest and most helpful way.
On The Benefits of His Physics Degree
Some basic skills like math, programming, modeling simulations, etc., certainly help with engineering. But I didn’t have any finance or engineering background, so many of the specifics were missing.
But what has been the most helpful for my career is scientific thinking.
Studying physics or any other scientific subject will teach you how to make a claim and then find ways to refute your own claim. You want to break your own ideas. Then, for whatever survives, you will have confidence in it. This is hard to do because this mindset doesn’t come naturally and isn’t intuitive to human beings.
A bias towards simplicity (Occam’s razor) also got hammered into me throughout my academic career, which is very useful in the industry.
Show Notes
(01:59) David recalled his undergraduate experience studying Physics and Mathematics at Duke University back in the early 90s.
(05:55) David reflected on his decision to pursue a Ph.D. in Physics at the University of Maryland, College Park, specializing in Nonlinear Dynamics and Chaos Theory.
(10:18) David unpacked his Nature paper called “Topology in Chaotic Scattering.”
(14:43) David went over his two papers on fractal dimensions in higher-dimensional chaotic scattering following his Nature publication.
(21:42) David talked about his project K Desktop Environment, which provides a free, user-friendly desktop for Linux/UNIX systems (later turned into a print book with MacMillan Publishing in 2000).
(24:20) David explained the premise behind his work on Andamooka, a site that supports open content.
(27:24) David walked through his time as a quantitative analyst at Thales Fund Management after finishing his Ph.D.
(30:50) David discussed his 4-year stint at Lehman Brothers, moving up the ladder into a Vice President role, up until Barclays Capital acquired it.
(33:24) David talked about his proudest accomplishment during the 5-year stint as a headdesk in equities trader at KCG/GETCO.
(35:37) David shared war stories while working at an investment firm called Teza Technologies and co-founding Galaxy Digital Trading (specializing in cryptocurrency trading).
(41:34) David unpacked key concepts covered in his guest lectures on optimization of high-frequency trading systems at NYU Stern School of Business.
(44:26) David explained his career change to work as a Machine Learning Engineer at Instagram in the summer of 2019.
(47:17) David briefly mentioned his transition back to a quant trader role at 3Red Partners.
(48:05) David is writing a technical book with Manning called “Tuning Up,” which provides a toolbox of experimental methods that will boost the effectiveness of machine learning systems, trading strategies, infrastructure, and more.
(50:48) David reflected on the benefits of his physics academic background for his quant analyst career.
(52:27) Closing segment.
David’s Contact Info
Mentioned Content
Publications
“Topology In Chaotic Scattering” (Nature, May 1999)
“Fractal Dimension of Higher-Dimensional Chaotic Repellors” (June 1999)
“Fractal Basin Boundaries in Higher-Dimensional Chaotic Scattering”
Book
“The Elements of Statistical Learning” (by Trevor Hastie, Robert Tibshirani, and Jerome Friedman)
People
Jim Simons (Founder of Renaissance Technologies)
Michael Kearns (Professor at the University of Pennsylvania, previously leading Morgan Stanley’s AI Center of Excellence)
Vasant Dhar (Professor at NYU Stern School of Business, Founder of SCT Capital)
Tuning Up — From A/B testing to Bayesian optimization
Manning’s permanent 40% discount code (good for all Manning products in all formats) for Datacast listeners: poddcast19.
You can refer to this link: http://mng.bz/4MAR.
Here are two free eBook codes to get copies of Tuning Up for two lucky Datacast listeners: tngdtcr-AB2C and tngdtcr-6D43
You can refer to this link: http://mng.bz/G6Bq.
About the show
Datacast features long-form conversations with practitioners and researchers in the data community to walk through their professional journey and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths, from scientists and analysts to founders and investors, to analyze the case for using data in the real world and extract their mental models (“the WHY”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:
If you're new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.