February 2024 BenGoldhaber.com Newsletter

Cultivating headspace, automating forecasting, and keeping score.

Mar 06, 2024

Update on my quest to become a Bay Area Cliché: The past two months I have been meditating about 45 minutes a day.

This is a big change for me; while I’ve on-and-off meditated since ~2015, it’s previously been for 15-20 minutes at a time. On the advice of an experienced meditator I started sitting longer, and indeed there was a really noticeable step change in the returns to meditation between the 20 and 30 minute mark.

Benefits I’ve found:

Generally smoother, less ‘jerky’ cognition.
I feel less reactive and distracted.
Easier to consciously put my attention on difficult or tricky topics.
Overall, more present.

And, it’s great having a cornerstone activity I do every day that is about returning to internal balance. I finally get why people like yoga.

I’ve been using The Mind Illuminated as my guide and I highly recommend it. It engages deeply with the theory and the practice, while grounding it in a scientific modality. Plus it has fun pictures:

The north star for me has been a goal of cultivating more ‘headspace’. I alluded to this in my year in review post:

Being able to clearly let my attention focus on what I actually want in an exchange avoids the all too common scenario where I’m trying to accomplish a secret objective - get someone to like me, minimize conflict, etc.

Similarly, and poetically, from The Mind Illuminated:

When the mindfulness of a samurai warrior fails, he loses his life. When we lack mindfulness in daily life, something similar happens. We become so entangled in our own thoughts and emotions that we forget the bigger picture. Our perspective narrows, and we lose our way.

These reference the ways that I, as - let’s face it - a modern-day samurai, make more mistakes without headspace. On the flip side, having spaciousness and mental clarity seems pretty clearly to be a good way to have more fun and joy in life.

I’ve copied into a Google Doc my working notes, handles and ideas related to Meditation.

Other notable cliche Bay Area activities - I went skiing for the first time ever in February. Pleasantly surprised I could ‘do the skiing move’ and go down a bunny slope after only a couple of hours.

I now think of skiing as a fundamental part of my identity.

Last February I mentioned my excitement about language models being used for applied forecasting.

Individual forecasters and prediction markets/tournaments will generate the knowledge graphs and training data that LLMs will consume. In turn, LLMs will generate potential explanations and models underlying the questions, which forecasters can use as an aid in thinking through other questions, and which the public will be able to query to finally achieve the dream of applied forecasting: the ability to ask and receive an answer to the crucial question of ‘what’s going to happen’.

A year later, Jacob Steinhardt has released a paper on language models approaching human-level forecasting.

On average, the system nears the crowd aggregate of competitive forecasters, and in some settings surpasses it. Our work suggests that using [language models] to forecast the future could provide accurate predictions at scale and help to inform institutional decision making.

This is better performance on the actual act of forecasting than I expected. I imagined language models being useful for generating potential explanations of the estimates created by human forecasters, but it seems that an end-to-end system of news retrieval + fine-tuning can generate high-quality forecasts without needing the humans.

We find that our system performs best relative to the crowd on the validation set when (1) the crowd is less confident, (2) at earlier retrieval dates, and (3) when it retrieves many articles. Furthermore, we find that our system is well-calibrated. First, our system significantly outperforms the crowd when the crowd’s predictions express high uncertainty. Specifically, when the crowd’s predictions are between .3 and .7, our Brier score is .199 compared to the crowd’s .246. However, our system underperforms the crowd on questions where they are highly certain, likely because it rarely outputs low probabilities.
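For anyone who hasn't met the Brier score: it is the mean squared error between the forecast probability and the 0/1 outcome, so lower is better, and always answering 50% scores exactly 0.25. Here is a minimal sketch of the calculation. The function name and the toy forecasts are mine, made up for illustration, and are not taken from the paper.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Toy data, invented for illustration (not from the paper).
outcomes = [1, 0, 1, 1, 0]                   # what actually happened
coin_flip = [0.5] * len(outcomes)            # always say 50%
confident = [0.8, 0.2, 0.7, 0.9, 0.1]        # leans the right way each time

print(brier_score(coin_flip, outcomes))      # 0.25
print(brier_score(confident, outcomes))      # much lower, i.e. better
```

Seen this way, the crowd's .246 on its least confident questions is barely better than the .25 you get from always saying 50%, which makes the system's .199 in that slice a real gap.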

And, on its heels, a paper from Tetlock showing that language models, when grouped together, match human crowd accuracy. A simpler setup than Steinhardt’s paper, but still effective.
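One simple way to 'group' forecasts is to ask several models the same question and take the median of their probabilities. The sketch below assumes median aggregation, which I haven't checked against the paper's exact method, and the model names and numbers are hypothetical.

```python
from statistics import median


def aggregate_forecasts(model_probs):
    """Combine per-model probabilities into one forecast by taking the median."""
    if not model_probs:
        raise ValueError("need at least one model forecast")
    for name, p in model_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} gave an invalid probability: {p}")
    return median(model_probs.values())


# Hypothetical models and numbers, for illustration only.
forecasts = {
    "model_a": 0.62,
    "model_b": 0.55,
    "model_c": 0.91,  # an outlier; the median ignores how extreme it is
    "model_d": 0.58,
    "model_e": 0.60,
}

print(aggregate_forecasts(forecasts))  # 0.6
```

The median is a reasonable default here because one overconfident model can't drag the group forecast very far, unlike a plain average.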

This seems like it should be a major, huge event - AI is about as good as human forecasters. Just for starters, we could automatically generate calibrated forecasts for every geopolitical question in every news article or every academic paper, reshaping our knowledge economy.

But while my first reaction is to expect a revolution in forecasting from this, experience has taught me it will take longer than I, ironically, predict. The general pattern in AI over the past four years has been crazy improvements on academic benchmarks and incredible demos, followed by less immediate impact in the world than you’d expect. How we actually use these innovations requires a lot of tweaking and fiddling on the human-computer interface front.1

I expect (75%) to hear that at least one hedge fund is using this type of system in a serious way (not just for clickbait or as an IBM Watson demo) to augment its traders before the end of the year.

Gen-Z is using deepfakes to teach each other calculus. This is far more compelling than most of the calculus classes I remember - Kim and Kanye really seem to care about the Math.

I’m of two minds about this use of deepfakes. On the one hand, we’re surrounded by content that is optimized for dopamine hits and distraction; at the very least this includes the watcher’s educational goals in that optimization function. On the other hand, I worry about AI persuasion risks, and competition to make ever more fake humans hook the viewer is not a good path to go down (also, while it’s been a long time since I took calculus, I’m pretty sure this isn’t a correct explanation).

Ultimately, even though it’s a trite example, it’s still a deepfake that was created non-consensually, and I agree with and have signed this letter calling for disrupting the deepfake supply chain. I expect the future of entertainment will look like licensing personas to consensual deepfake video generation, and hopefully we’ll develop the right type of cultural norms to balance the AI and video.

We’re in another crypto (bubble/supercycle), and as someone who has been around crypto for a while, I occasionally get asked what my views are. I think crypto is:

Good for defying government regulation (aka buying drugs online, circumventing capital controls).
Good for satisfying the fundamental human need to pump and dump tokens that are named after dogs.

If you want to make a portfolio based on this, I recommend having some exposure to crypto that has strong crypto-anarchist cred (bitcoin, monero), and some exposure to crypto that is really stupid. Like, the stupider the better.

#good-content

Helldivers 2: The first video game I’ve played in a while. It’s a very fun, co-op action shooter. I appreciate that you’re playing against bots, since I would have no chance against the fast-twitch reflexes of a 14-year-old. The Starship Troopers aesthetic is fun too.

The Score Takes Care of Itself: After a tough 49ers Super Bowl loss, I read the classic book on leadership from Bill Walsh, legendary 49ers head coach.

I like books like this not because I’m likely to learn something novel - obviously, work ethic matters, focus on the details - but because the aesthetic vibe of a very different kind of person bleeds through, and you can pick up on it, and maybe integrate some of it; or at least appreciate it.

The level of micro-management and obsession with perfect execution is on some level inspiring.

Accuracy, accuracy, precision in execution of everything at all levels. No sloppiness. Game-level focus was the price of admission.
Exhibit a ferocious and intelligently applied work ethic directed at continual improvement
Champions behave like champions before they’re champions; they have a winning standard of performance before they are winners

I suspect integrating the Bill Walsh philosophy, with the Meditation philosophy, is the Way2.

Ben

1. That’s one reason I’ve been so excited about the success of Manifold - another thing I didn’t predict - they’ve built a community ‘playing around’ with the art of forecasting, which seems to be the way to make actually useful things.

2. Look, I promise not to get weird with this meditation stuff. Next month you’ll get more Twitter links and the personality quizzes that I know my newsletter readers love.
