Oliver Gilan

I Was Wrong

Notes on the rise of AI and large language models.

I’m pretty proud of the fact that last year, around October, I pulled out of my crypto positions and largely switched my stance on the technology. For months, at every party and get-together with other tech folks, I was the only one in the room voicing skepticism about the industry. I have been vindicated for the time being, but the victory is sour because while I was excitedly exploring crypto to its natural conclusion I was neglecting, and horribly mistaken about, an adjacent industry: AI. When AI started to build hype around 2012 with things like IBM’s Watson and then Google DeepMind it was very exciting. I did Google’s machine learning course and learned the concepts behind those networks but came away unimpressed. It felt like powerful curve fitting with some very narrow applications, and it certainly wasn’t anywhere close to general intelligence. Nevertheless the hype grew and startup after startup popped up with “AI” in the name, promising magical results. Most of these startups weren’t even attempting to use neural networks to solve their problems– a fact that more than one founder has openly admitted to me– and the hype didn’t seem to be leading anywhere.

That doesn’t mean there weren’t cool results; there certainly were, but they were mostly academic, or they seemed to reflect Moore’s law and how far computing had come rather than some fundamentally new type of software. AI became a buzzword, and anyone using it was either talking about social media recommendation engines, self-driving, or straight up bullshitting. Sometimes all three.

So I lost interest. The hype seemed unjustified, the field seemed to plateau for a number of years, and every time someone brought up AI to me I dismissed it and its potential impact. Candidates like Andrew Yang ran on platforms warning that automation and AI would replace jobs, and I simply didn’t buy it. I still don’t fully buy it, for what it’s worth, but I now see a viable path to mass automation (whether that proves catastrophic for workers or instead creates more abundance than humanity has ever seen remains to be seen). Whenever AI came up it was in the context of AGI or solving self-driving, two problems that weren’t even remotely close to being solved, and in dismissing those use cases I completely missed the slow and steady progress taking place in the field. Then things began rapidly and consistently changing with the release of GPT-3.

GPT & The Rise of LLMs

GPT-3 was released in June of 2020 and it was obviously different from any other “AI” I had seen. Its ability to respond to a variety of prompts, and the quality of those responses, was clearly on a different level, even though it was still limited. It failed on certain topics, it was happy to spout impressive-sounding babble that didn’t actually mean anything, and any attempt at an ongoing conversation would quickly break down. Still, it forced me to step back and ask: is this going to change the world? My answer at the time was no, because of GPT-3’s obvious limitations and my implicit assumption that we were in for another decade-long plateau in the power of these models.

Then DALL·E was released at the start of 2021, followed about a year later by DALL·E 2, and I was blown away by its ability to understand certain concepts and synthesize images from them. It was the first time I began to wonder whether some fundamental understanding of concepts was happening inside these models. But I thought to myself: this is impressive, but is it practical? How expensive is it to train such a model? How accessible is this tech, really? Then Midjourney and Stable Diffusion followed quickly after and blew that notion away. Now ChatGPT– which is basically just the GPT-3 model with some filters and context tracking– has displayed a level of intelligence and understanding that I never thought I’d see from a machine in my lifetime. And to top it all off, GPT-4 is rumored to be almost ready and is expected to be as big a leap as the jump from GPT-2 to GPT-3. It’s safe to say my implicit assumption of a plateau in AI tech was wrong; things instead seem to be speeding up.

Admittedly at the time of writing I don’t actually know how these models work at all. Back in the machine learning craze about 8 years ago I did some Google machine learning courses and learned the concepts around those systems. From what I’ve heard basically everything from back then is outdated and these systems operate under completely different principles. It’s very possible we’re not even scratching the surface of what these new models are capable of and this may just be the beginning of exponential growth in capability for LLMs. Such a thought is truly frightening.

It’s worth mentioning that I still don’t believe these models are conscious, whatever consciousness even is, but this brings me to my second mistake. I didn’t believe we were headed toward general intelligence in the sense of creating a computer that thinks like a human, but that’s sort of missing the forest for the trees. If GPT becomes 10x more powerful than it is today, it doesn’t really matter whether it’s “conscious” or not, because it’ll still be capable enough to perform a bunch of human functions and significantly alter the structure of society as we know it. In fact, the whole question of consciousness has started to feel ancillary, and while I have my theories about where consciousness comes from and how it might be created in a machine, it all feels largely irrelevant right now. It seems that consciousness is not a prerequisite for intelligence; if anything, the opposite is probably true.

Emergence of AI & Crypto

As an aside, I find it amusing that we are seeing this emergence of AI in parallel with the emergence of crypto. Both technologies started with niche groups of highly technical individuals building new technology to solve a problem. During the last crypto cycle, though, we saw it really grab hold with “finance bros” and entrepreneur types: the kind of person who likes the bling and flashiness of a place like Miami and envisions a future where they are the founder of the next big consumer social startup. This isn’t a knock against anyone of that description, but the result was that everyone I knew in crypto fell into two buckets: they were either extracting as much wealth as possible through financial market mechanisms like arbitrage (or outright scams if they weren’t technical), or they were desperately trying to build the big use case for crypto. Social apps, gaming, emerging market loans, ads, etc. Especially near the end it felt like everyone had an idea for how to make crypto useful in the real world, and everyone wanted to claim that throne.

In contrast, the AI space still feels overwhelmingly technical. It’s as though the entrepreneur-lifestyle type of person who six years ago would have started a company with AI in the name just to attract investors has since pivoted to crypto and hasn’t yet pivoted back. Maybe they’re burnt out from going all in on an industry that doesn’t seem like it’ll pan out, maybe higher interest rates prevent this behavior, or maybe they just haven’t caught on yet to what’s been happening with these new models, but everyone I know who’s interested in AI is interested in building the models themselves, not in end-user applications. As a result I think there’s a lot of alpha to be had right now in building the actual customer-facing tools powered by the models instead. We will absolutely see a massive crop of AI startups in the immediate future, but for now the low-hanging fruit is all still there.

Pitfalls of AI Startups

I do think there are some pitfalls to watch out for if you plan on creating a product built around AI. The first lies with the models themselves. GPT is incredibly impressive, yet obviously limited and prone to making stuff up or spitting out senseless babble. It would be hard to trust its output with anything critical, and using an automated system to sanitize or correct that output is also difficult because it’s generally completely unstructured.

Secondly, and the biggest risk in my opinion, is the platform risk of relying on a company like OpenAI. We just saw how a company as big as Facebook could be brought to its knees by a decision made by executives at Apple. Building on top of GPT represents an even bigger platform risk than that; it absolutely needs to factor into any business decision, and it’s something that should scare the shit out of any startup trying to use OpenAI’s tools. One way around this is to only use open source models like Stable Diffusion– though there’s no guarantee there will be comparable open source alternatives to something like GPT-4 and later models– and another is to just accept the risk and dedicate a significant amount of revenue and resources to building your own models right from the start. I’m willing to bet a lot of the AI companies that end up being successful will go that route. If you can start with GPT, build a valuable product, and start generating revenue, you can then use that revenue to build in-house models that wind down the platform risk. That’s easier said than done, though. It remains to be seen how easy building such models will be. It’s possible that once it’s demonstrated that such a model can be built it becomes easy to reproduce, and that most of the cost is in the initial research and experimentation.

There’s a third, less obvious risk: the space might simply be moving too fast to reliably build a customer-facing product. Maybe GPT-4 comes out and you immediately start building a product powered by it. A year later you’ve shipped the first version and acquired some customers, but then another year after that GPT-5 comes out and completely commoditizes your abilities. Would you be able to simply upgrade models? Are they plug-and-play? It’s probably not so straightforward, which means that building right now carries a risk of becoming outdated very quickly. This is a fairly unique risk, but one that should be considered.
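
One way to blunt that risk at the code level, at least, is to keep the model behind a thin interface of your own so that swapping providers or model versions touches as little of the product as possible. Here is a minimal sketch in Python; the backend classes and the `summarize_ticket` helper are entirely hypothetical, and any real vendor API will differ:

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything that can turn a prompt into text; the rest of the product only sees this."""

    def complete(self, prompt: str) -> str: ...


class Gpt4Backend:
    """Hypothetical wrapper around a hosted model's API (real SDK calls would go here)."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError


class InHouseBackend:
    """Hypothetical wrapper around a self-hosted model, added later to wind down platform risk."""

    def complete(self, prompt: str) -> str:
        raise NotImplementedError


def summarize_ticket(model: TextModel, ticket_text: str) -> str:
    # Product code depends only on the TextModel interface, so upgrading or swapping
    # the underlying model is a one-line change where the backend is constructed.
    return model.complete(f"Summarize this support ticket in two sentences:\n{ticket_text}")
```

None of this removes the business risk, but it keeps a model upgrade from rippling through the whole codebase.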

Evaluating Potential AI Startup Ideas

One of the interesting phenomena I’ve noticed with GPT and other LLMs is that they are extremely impressive and yet it is very difficult to envision exactly how these models can or might be used. The obvious low-hanging fruit is there: AI profile pictures and picture book illustrations with DALL·E, musical lyrics, poetry, different flavors of chatbots with GPT, etc. Those are certainly valid use cases, yet none of them are venture-scale, and I’ve had trouble thinking of what can be done that is venture-scale. I’ve been using the following framework to evaluate what sort of products and businesses would work well with a GPT-like model:

Domain w/ Clear Boundaries: Avoid completely open-ended domains like “browsing the internet.” That problem space just feels too vast to reliably and consistently engineer these models to handle properly. If you are going to build an AI product that browses the internet, have it focus on a select number of websites and a set number of actions instead. A digital assistant could reliably navigate to Google Calendar and manage a schedule. Then again, if the domain is too simple you probably don’t even need an LLM and would see better results with plain ol' software. Going back to the previous example, once you’ve captured user intent it’s trivial to use Google Calendar’s API to automate someone’s calendar without any AI involved.
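
To make that split concrete, here is a rough sketch of the pattern: the model is only asked to turn a free-form request into a small structured intent, and ordinary code does the rest. The `llm_complete` callable and the `calendar_client` are hypothetical placeholders standing in for a real model API and a real calendar API:

```python
import json
from dataclasses import dataclass


@dataclass
class EventIntent:
    title: str
    start: str  # ISO 8601 timestamp
    duration_minutes: int


INTENT_PROMPT = (
    "Extract a calendar event from the user's request and reply with only JSON "
    'of the form {"title": ..., "start": ..., "duration_minutes": ...}.\n\nRequest: '
)


def parse_intent(user_request: str, llm_complete) -> EventIntent:
    # The LLM handles the messy, open-ended part: understanding natural language.
    raw = llm_complete(INTENT_PROMPT + user_request)
    data = json.loads(raw)  # if this fails, re-prompt or fall back to asking the user
    return EventIntent(data["title"], data["start"], int(data["duration_minutes"]))


def schedule(user_request: str, llm_complete, calendar_client) -> None:
    intent = parse_intent(user_request, llm_complete)
    # The bounded, deterministic part needs no AI at all: a plain API call creates the event.
    calendar_client.create_event(
        title=intent.title,
        start=intent.start,
        duration_minutes=intent.duration_minutes,
    )
```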

Pick Your Edge Cases Wisely: Any domain simple enough to have no edge cases is probably too simple to justify something like GPT. So you probably want a domain with some long tail of edge cases, but you should evaluate those cases very carefully and pick problems where the edge cases aren’t critical. This is why I don’t like self-driving as a problem space: there is a seemingly infinite number of edge cases, and getting any one of them wrong can lead to catastrophic outcomes. The same goes for something like a therapy chatbot. It only needs to mess up once to cause dramatic harm.

Reduce Scarcity: One funny observation about GPT, DALL·E, Stable Diffusion, etc. is that so far they’ve really only reduced the scarcity of things that are already post-scarcity. Sure, making digital art more accessible reduces its scarcity, but it’s not like there was a drought of good artists. Similarly, generating one-paragraph blurbs isn’t exactly reducing the scarcity of anything. Even at its best, generating song lyrics or poems isn’t dramatically reducing the scarcity of whatever it’s generating. It’s not as though you can’t be successful making music using GPT, and you can probably even make a GPT-based product designed to help musicians, but it doesn’t feel like a very defensible position and it also won’t change the world. Of course, changing the world is overrated and not everything has to be venture-scale, but for the sake of this exercise it’s worth keeping in mind.

So the real question going forward is: can this product effectively reduce the scarcity of something with extreme value? Can it do that by navigating a clearly defined domain consistently, and with much greater efficiency than its human counterparts? And when it messes up, are those mistakes easy to spot and correct, preferably in an automated fashion? It’s definitely difficult to think of problems that fit all those criteria, but they exist for roughly the same reason the P vs NP distinction matters: producing a solution is often far more expensive than verifying one. We currently have tons of humans doing jobs that are tedious, manual, and very automatable, with AI or otherwise. Actually doing the work is slow, but checking that the resulting work is correct can happen a lot faster. So even if your product requires human oversight, it can still provide massive efficiency gains. Hint: I think paralegals and investment banking analysts fit these criteria perfectly, and you could probably create a GPT-based product that lets one individual do the work of 20 in these fields.
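
That generate-then-verify asymmetry is easy to sketch in code. The expensive, fallible step is the model call; the cheap step is an automated check that a human (or more code) can run before anything ships. The `llm_complete` callable and the specific checks below are hypothetical, only meant to show the shape of the loop:

```python
def draft_summary(source_text: str, llm_complete) -> str:
    # Expensive, fallible step: ask the model to produce the work.
    return llm_complete(f"Summarize the following document in under 200 words:\n{source_text}")


def passes_checks(summary: str, source_text: str) -> bool:
    # Cheap verification step: fast, deterministic checks a reviewer can trust.
    short_enough = len(summary.split()) <= 200
    non_empty = bool(summary.strip())
    # Hypothetical domain check: any number cited in the summary must appear in the source.
    numbers_grounded = all(
        tok in source_text for tok in summary.split() if tok.replace(",", "").isdigit()
    )
    return short_enough and non_empty and numbers_grounded


def produce_summary(source_text: str, llm_complete, max_attempts: int = 3) -> str:
    # Verification is much cheaper than generation, so retrying a few times is still a net win.
    for _ in range(max_attempts):
        candidate = draft_summary(source_text, llm_complete)
        if passes_checks(candidate, source_text):
            return candidate
    raise RuntimeError("No candidate passed the checks; escalate to a human reviewer.")
```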

I will not personally be pursuing any ideas around GPT right now, but one of my big goals this year is to better understand the tech behind these large language models. My current stance is that there’s probably a 30% chance these models end up being a more impactful invention than the iPhone and the internet combined. There’s still a massive chance they turn out to be impressive, fun, and even useful but not revolutionary, and yet… 30% is quite a big chance to upend the world.