How cost-effective are AI safety YouTubers?
Early work on “GiveWell for AI Safety”
Hey! Austin here. Some of Manifund’s most popular projects have been videos on AI safety. We may push more in this direction — eg running a content creator fellowship out of Mox. At the same time, I’ve wanted better estimates of the impact of work we fund. Regrantor Marcus Abramovitch and I have started looking into data on these videos, and are excited to share early results. Here’s Marcus:
Intro
EA was founded on the principle of cost-effectiveness. We should fund projects that do more with less, and more generally, spend resources as efficiently as possible. And yet, while much interest, funding, and resources in EA have shifted towards AI safety, it’s rare to see any cost-effectiveness calculations. The focus on AI safety is based on vague philosophical arguments that the future could be very large and valuable, and thus whatever is done towards this end is worth orders of magnitude more than most short-term effects.
Even if AI safety is the most important problem, you should still strive to optimize how resources are spent to achieve maximum impact, since there are limited resources.
Global health and animal welfare organizations work hard to measure cost-effectiveness: they evaluate charities, make sure effects are counterfactual, run RCTs, estimate moral weights, scope out interventions, and more. Conversely, in AI safety/x-risk/longtermism, very little effort is spent on measuring impact. It’s hard to find anything public that compares the results of interventions. Perhaps funders make these calculations in private, but one of the things that made GiveWell so great was that everything was out in the open. From day 1, you could see GiveWell’s thinking on their blog, examine their spreadsheets, and change parameters based on your own thinking.
I’m making a first attempt at a “GiveWell for AI safety”. The end goal is to establish units to measure the cost-effectiveness of AI safety interventions, similar to DALY/$ in global health. Scott Alexander’s survey from last year is a great example of different kinds of relevant units for AI safety work. Given units like these, a donor or funder can look at the outputs of different AI safety orgs and “buy” certain things for far cheaper than they can for others.
To begin, I’m looking into the cost-effectiveness of AI safety communications. I’m starting with communications because it’s easiest to get metrics, and the outputs of comms work are publicly viewable (compared to e.g., AI policy work) and more easily assessable (compared to e.g., technical AI safety work). For this post, I’m going to be specifically focused on YouTube videos, where metrics were easiest to gather; I’m introducing my framework for evaluating different YouTube channels, along with measurements and data I’ve collected.
Step 1: Gathering data
The end-goal metric I propose for the cost-effectiveness of AI safety videos is “quality-adjusted viewer-minutes per dollar spent” (QAVM/$). I started by collecting data on channel viewership and costs.
Viewer minutes
To calculate time spent watching videos, I first wrote a Python script to query the YouTube API for the views and length of every video on specific channels. I multiplied these together, then adjusted by a factor for the average percentage of a video watched: 33%, based on conversations with creators. I also messaged creators asking for their numbers directly; for any who responded, I used their metrics instead (usually screenshotted, for authenticity).
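As a rough sketch of this pipeline (with the API fetch omitted, and made-up video stats standing in for real channel data):

```python
# Sketch of the viewer-minutes estimate described above. Real numbers
# come from the YouTube API; the sample stats below are hypothetical.

WATCH_FRACTION = 0.33  # average share of a video watched, per creator chats

def viewer_minutes(videos, watch_fraction=WATCH_FRACTION):
    """Estimate total viewer-minutes from (views, duration_minutes) pairs."""
    return sum(views * duration * watch_fraction for views, duration in videos)

# Hypothetical channel with three videos:
sample = [(100_000, 12.0), (40_000, 8.5), (250_000, 15.0)]
total = viewer_minutes(sample)
print(f"{total:,.0f} viewer-minutes")  # ~1.7M viewer-minutes
```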
Costs and revenue
To measure cost-effectiveness, we also need to estimate the costs of making the videos. In addition to direct costs (such as equipment and editing), I consider the value of the time of the people producing videos to be a major cost, since they could be doing other things, such as earning to give. Because of this, I generally ask people to estimate their market-rate salaries if they are doing the work unpaid. For example, for AI in Context, costs included the salaries paid to 80,000 Hours employees for the production of their video. For other creators who had lots of personal savings and weren’t getting paid, I asked them to include the value of their time.
Some channels and podcasts also produce revenue through ads and sponsorships. This is a very good thing and is a sign that people want to see the content. In fact, I expect the best content will be able to self-fund after some time, and even be profitable. That said, for now, most aren’t profitable and subsist on donations; thus, I count revenues as offsetting costs for cost-effectiveness calculations, because this funding was produced organically, though I’m open to treating this differently.
Results
Before we get into quality adjustments, here’s a snapshot of where different channels stand:
Step 2: Quality-adjusting
For quality adjustment, I introduce three factors: a quality adjustment for the audience, Qa; a quality adjustment for the fidelity of the message, Qf; and one for the alignment of the message, Qm. So the overall equation is:
Quality-adjusted viewer-minutes = Views × Video length × Watch % × Qa × Qf × Qm
Quality of Audience (Qa)
This quality adjustment is meant to capture the average quality/importance of the audience. The default is 1, as if the average viewer were an average person (a totally random sample from the human population). If an audience member is more influential, has more resources, or is otherwise more impactful on the world, this number goes up; if less, it goes down. At the extremes, you might assign a value of 0 to someone who will die alone right after watching the video, and perhaps as high as 1,000,000,000 to an audience consisting of just the President of the US. Other things that make a viewer valuable include being at a pivotal time in their career, being extremely intelligent, etc.
This is comparable to, but not directly correlated with, CPM (cost per thousand views, i.e., how much advertisers would pay to reach this audience). In theory, this value could be negative (it would be better for the person not to watch the video; we’d pay to have their internet shut down so the video doesn’t load), though I’d suggest ignoring that possibility.
Normal values for this factor will be between 0.1 and 100, but should center around 1-10.
Fidelity of Message (Qf)
This refers to how well the intended message is received by the audience, on average, across all viewer-minutes. It attempts to measure both the importance of the message and how well it is conveyed. If your goal is to explain instrumental convergence or give the viewer an understanding of mechanistic interpretability, what’s measured here is how well the video conveys that message, alongside how important the message is.
An intuitive way to grasp this metric is to consider how much you’d rather someone watch one minute of a given video compared to a reference video. For now, I am somewhat arbitrarily setting the reference to be the average of Robert Miles’ AI safety videos. (I’m seeking a better reference; perhaps one that is more widely known, or simply considered the canonical AI safety video.) If you would trade X minutes of watching the video in question for 1 minute of the average Robert Miles video, then Qf is 1/X.
Normal values of this number for relevant videos will be between 0.01 and 10.
Alignment of Message (Qm)
This factor captures how well the video’s message aligns with your values. It ranges from -1 to 1, where 1 is “this is the message I most want to get across to viewers” and -1 is “this is the exact opposite of the message I want to get across to viewers.” For example, if your most preferred message is “change your career to an AI safety career” and a particular video’s message is “pause AI,” which you prefer half as much, its Qm is 0.5; if a video’s message is “accelerate AI as fast as possible,” the exact opposite of what you want, you’d give it a value of -1.
Importantly, this is perhaps the most subjective of the three factors and depends greatly on your values.
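Putting the pieces together, here is a minimal sketch of the whole calculation (all input values below are hypothetical):

```python
# Quality-adjusted viewer-minutes per dollar, per the formula above.

def qavm_per_dollar(views, length_min, watch_pct, qa, qf, qm, net_cost):
    """QAVM/$ = Views x Length x Watch% x Qa x Qf x Qm, divided by net cost."""
    qavm = views * length_min * watch_pct * qa * qf * qm
    return qavm / net_cost

# Hypothetical channel: 2M views of 10-minute videos, 33% watched,
# above-average audience (Qa=2), reference-level fidelity (Qf=1),
# well-aligned message (Qm=0.9), $50k in net costs.
result = qavm_per_dollar(2_000_000, 10, 0.33, 2, 1, 0.9, 50_000)
print(f"{result:.0f} QAVM/$")  # ~238 QAVM/$
```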
Results
Here are my results:
Feel free to make a copy of the Google Sheet, insert your own values, play around with it, and compare things, as you can with GiveWell’s spreadsheets. I intend to update the master spreadsheet as I receive info (costs, viewer minutes, etc.), so I don’t recommend changing non-subjective values. I’ve added comments in the Google Sheet to explain my estimates. I’m seeking recommendations for the best way to distinguish estimates from reported data, or any other suggestions.
Observations
The top cost-effectiveness comes from creators who monetize their content (AI Species and Cognitive Revolution) and from well-produced videos of typical YouTube length (5-30 minutes), not from long podcasts or short-form videos.
We’re seeing that for 1 dollar, good AI safety YouTube channels generate on the order of 150-300 QAVM (or about 2.5-5 hours).
Some things to compare this to:
Global average income is about $12k/year or about $6/hr. We could pay people this wage to watch videos; this would be 10 viewer-minutes per dollar.
We could pay to promote existing videos on YouTube. CPM on YouTube is about $0.03 per ~15-second view, which works out to $0.12/minute, or about 8 VM/$.
We could pay to show video ads in other locations (eg, Super Bowl ads). A 30-second ad on Super Bowl LIX would have cost $8M and been shown to 120M US people, or ~8 viewer-minutes per dollar - with perhaps a 3x audience quality adjustment for primarily US viewers, you get 24 QAVM/$.
Of course, these comparisons don’t include the cost of video production itself.
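For what it’s worth, the benchmark arithmetic above can be checked directly (the post’s figures round slightly differently):

```python
# Reproducing the comparison benchmarks above.

wage_vm_per_dollar = 60 / 6                # $6/hr wage -> minutes per dollar
cpm_vm_per_dollar = 0.25 / 0.03            # $0.03 per ~15 s (0.25 min) view
sb_vm_per_dollar = (120e6 * 0.5) / 8e6     # $8M, 30 s ad, 120M viewers
sb_qavm_per_dollar = sb_vm_per_dollar * 3  # ~3x audience quality (US viewers)

print(wage_vm_per_dollar, cpm_vm_per_dollar, sb_vm_per_dollar, sb_qavm_per_dollar)
# 10.0, ~8.3, 7.5, 22.5
```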
Other notes from this exercise:
Over the course of this, I asked a lot of people, formally and informally, who they thought of as AI safety communicators, YouTubers, etc. Essentially everyone said Robert Miles was the first who came to mind, and a few said they have careers in AI safety at least partially due to his videos. This led me to create the “audience quality” category and rate his audience much higher.
Here I am measuring average cost-effectiveness, though we probably want to be measuring it at the margin. While that is a different exercise, this one should still be illuminating and should serve as a good benchmark to aim for.
Where is Dwarkesh? Austin and I argued about this for a while — Austin thinks Dwarkesh should be included as someone whose channel reaches important people, while I think that very few of Dwarkesh’s videos count towards AI safety. I informally surveyed a bunch of people by asking them to name AI safety YouTubers, and Dwarkesh’s name never came up. When his name was mentioned, he was not considered an “AI safety YouTuber”.
Many creators seemed to want me to include their future growth. I think this is perhaps a bit too subjective and would introduce a lot of bias. These view counts are a snapshot at this point in time, but when deciding what to fund, you should also consider growth rates and look to fund things that could reach a certain cost-effectiveness bar.
I don’t think this post should cause large-scale shifts in what gets funded or what people do, but I do think cost-effectiveness is one of the things people should be looking at for the projects they pursue.
How to help
The main thing I need is data, both on metrics and on costs, including the value of the time people spent making these videos. A lot of data is hard to come by; where it is, I make estimates. Thanks to all the people who have already responded to my emails and text messages with data!
For next steps, I am planning to expand this into all media of AI communications (podcasts, books, articles, signed letters, article readings, website visits), collect data on metrics and costs, and come up with quality adjustments. If you make content or work for an org that has data, please message me at marcus.s.abramovitch@gmail.com.
The end goal of this remains to cross-compare different “interventions” in AI safety, like fieldbuilding (MATS), policy interventions (policy papers, lobbying efforts), and research (quality/quantity of papers). Stay tuned!
Appendix: Examples of Data Collection
Rob Miles
The Donations List Website indicates Rob has received ~$300k across 3 years from the LTFF. Since Rob has been making videos for ~10 years, I estimate ~$100k/year, for a total of $1M. I used my scraper and an estimated ~33% average watch time per view, for ~31.6M viewer-minutes. I will update this entry as soon as Rob responds to me, since Rob is considered the canonical AI safety YouTuber.
AI Species (Drew Spartz)
Drew told me he spent $100k on all the videos on his channel; including his time (~1 year at $100k/year), $200k would be a fair estimate of total resources spent. I used my script to pull views for each video and estimated that about 50% of each view was a full watch. I sent this to Drew, who told me 30-35% was more realistic, and then gave me his raw data. This was very helpful, since it allowed me to calibrate watch percentage for various video lengths.
Rational Animations
I looked at Open Phil grants, which totaled $2.785M. FTX FF gave a further $400k. I then added ~10% for other funding from individuals or other grantmakers that didn’t show up. I scraped their data for views from the YouTube API and used a 33% watch-through rate.
After this, Rational Animations confirmed to me that they in fact spent more and received more grants, which summed to $4,395,132.
AI in Context
Aric Floyd told me that 80k spent $50.75k on the video and that staff time on the video summed to ~$75.2k for a total of $126k. He also sent me his watch data for the video.
Cognitive Revolution
I asked Nathan, and he said I was approximately correct that $500k ($250k of which is on production and ~$250k paid to him as salary) is about what has been spent over the last couple of years for production. He also shared his YouTube data and suggested that he gets more engagement from audio-only. (Not to worry, Nathan, I’m analyzing this next. It’s just hard to get data.)
Cognitive Revolution is unique in that they have very substantial revenues because of sponsorships and YouTube views. This somewhat breaks my formulas for cost-effectiveness since donations aren’t required. In other words, Cognitive Revolution makes money. Therefore, I’m conservatively estimating the value of Nathan’s time here to be a bit higher, given his experience. Thus, $250k is spent on production, and $250k/year is spent on Nathan’s time. For 2 years, I have been considering the total cost of Cognitive Revolution to be $750k with $500k in revenues, though I am very open to changing these numbers. I don’t know the best way to treat Nathan’s podcast given all these variables.

Very glad to see this analysis; I'm quite confident this is the most underleveraged area within AI safety. Big plus one on Manifund incubating more aspiring creators: it is incredibly hard to start right now due to the learning curve of YouTube algorithms, finding good editors, and writing engaging scripts.
Rather than using Rob Miles' channel as the reference point, my intuition would be to use Anthropic's videos. They serve as a good baseline for knowledge about AI safety work in general, and make more sense as a reference than a channel at the top end of the distribution curve. They also represent what the default comms around AI safety would be if no individuals/groups were receiving funding or taking a risk with their own funds to start channels.
Would also be interested to see Doom Debates added.
Thank you both for doing this, I appreciate the effort in trying to get some estimates.
However, I would like to flag that your viewer minute numbers for my short-form content are off by an order of magnitude. And I've done 4 full weeks on the Manifund grant, so it's 4 * $2k = $8k, not $24k.
Plugging these numbers in I get a QAVM/$ of 389 instead of the 18 you have listed. (Google sheets here: https://docs.google.com/spreadsheets/d/1qUrr9JSjip7fzQfa0lBCfstUlsfTRI0Bs5V2cpJKmfY/edit?usp=sharing).
Other data corrections:
- 1. You say that the FLI podcast has 17M views and cost $500k. However, 17M is the number of views on the overall FLI channel. If you look at the FLI podcast playlist [1] and just add up the numbers, you get something closer to 1M (calculation here [2]). I'm assuming the $500k comes from the podcast existing for 5 years and costing ~$100k/year? If so, this does not include the three most-viewed videos (which probably required a large budget) about slaughterbots and nuclear war (~12M of the 17M views). So really the Google Sheet should say 1M views, and the viewer minutes should be updated accordingly. (Unless they managed to produce all the slaughterbots/nuclear war content on a $500k budget.)
- 2. Now, regarding watch time: saying that podcasts have a 33% watch time is, I think, overly optimistic. To give you some idea, in my case a good 12-minute video with 40k views has ~40% watch time, and my most-viewed podcasts average [3] 12% watch time. So I'd say you're probably off by a factor of 3 for podcast viewer minutes.
- 3. Finally, for a couple of these podcasts the views are inflated because the podcasts are promoted via paid advertising. Some creators are quite open about it, so you can ask them directly and they'll tell you. If you really want to know whether the views are inflated, one way to determine it is to look at the likes/views ratio. For instance, if a podcast has 20,000 views but 50 likes, the ratio is 0.25%. This is 20x to 40x too low (cf. here [4]) for a non-promoted YouTube video with the same number of views. And if you look closely at a couple of the podcasts you listed, you'll find exactly that. There is no problem with doing this (it probably helps growth, and it's good to spend money to get more eyeballs), but if you're spending money on non-organic views and viewers don't end up liking or engaging with your video (and thus not driving organic reach), then your QAVM/$ is basically the cost of ads on YouTube.
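To sketch that check (using the 0.25% example above; the ~5% organic benchmark is a rough assumed figure, not an exact one from [4]):

```python
# Rough likes/views sanity check for paid promotion.

def promotion_factor(likes, views, organic_ratio=0.05):
    """How many times below a typical organic like/view ratio a video sits.

    organic_ratio (~5%) is an assumed benchmark, not an exact figure.
    """
    return organic_ratio / (likes / views)

print(f"{promotion_factor(50, 20_000):.0f}x below organic")  # the 20x case
```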
If we now look at YouTube explainers:
- You end up ranking Rational Animations 8th in QAVM/$, despite it having the most views. I find this quite surprising because, in my opinion, RA is the highest-production channel on the list, on par with AI in Context. One factor is that you rate RA's quality of audience as 4, compared to 12 for Robert Miles. I understand this is because you had lots of conversations where Robert Miles' name came up and people said he had influenced them.
- However, I think Robert Miles' name comes up first in people's minds primarily because he was the earliest AI safety YouTuber. He is also one of the only channels on the list that shows his face in videos. So it's not surprising to me that many people in the community learned about AI safety through him. The 3x score difference (in quality of audience) with RA seems too high.
Regarding your weights, you place my TikTok and YouTube channels at 0.1 and 0.2 in quality, which I find surprising, especially YouTube:
- 1. On YouTube, my second-most-watched video [5] is an interview with Robert Miles, so it would be hard to argue that my content is 60x lower quality than Robert Miles’ own videos.
- 2. Similarly, for Cognitive Revolution: we were supposed to cross-post my Evan Hubinger interview [6] to his platform, and ended up cross-posting my Owain interview [7] instead. So can we really say that his content is 12x higher quality if (in some cases) Nathan would be happy for the content to be literally the same?
Overall, I’m a bit disappointed by your data errors given that I replied to you by DM saying that your first draft missed a lot of important factors & data, and suggested helping you / delaying publication, which you refused.
[1] https://www.youtube.com/playlist?list=PLpxRpA6hBNrwm43A7pKMEHvROr3vXM_eV
[2] https://docs.google.com/spreadsheets/d/1GXIZ-LtBML9mhYp4m3b3I-gqG8sCja9hexrydNgqWuY/edit?usp=sharing
[3] https://docs.google.com/spreadsheets/d/14gvoNEEpyoLK7n21XeJ6vZuSVZLy0sW6oxjllDYgGuY/edit?usp=sharing
[4] https://www.upfluence.com/influencer-marketing/average-engagement-rate-on-youtube?utm_source=organic&utm_campaign=direct&utm_medium=organic
[5] https://youtu.be/DyZye1GZtfk
[6] https://youtu.be/S7o2Rb37dV8
[7] https://youtu.be/eb2oLHblrHU