15 Comments

Also, pre-empting the haters. OpenAI is positive margin. It's expensive to serve inference, but you can still make money.

Is there any source for this?

SF dinner parties

> OpenAI is positive margin.

> SF dinner parties

Well, in that case, this statement could mean different things. Are they positive margin with respect to how much they charge for a request versus the inference cost required to serve that request, completely ignoring the training cost?

I think the training costs become negligible; inference accounts for way, way more usage. Also, for a startup, having a $1 billion+ revenue product that is profitable on usage is nutzo.
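The "training costs become negligible" argument can be sketched with invented numbers (none of these are real OpenAI figures): the one-time training cost, amortized per token, shrinks as usage grows, so per-token margin approaches price minus inference cost.

```python
# Illustrative unit economics. ALL numbers are hypothetical, for illustration only.
TRAINING_COST = 100e6      # one-time training cost ($, invented)
PRICE_PER_1K = 0.002       # revenue per 1K tokens served ($, invented)
SERVE_COST_PER_1K = 0.001  # inference cost per 1K tokens ($, invented)

def margin_per_1k(total_1k_units_served: float) -> float:
    """Per-1K-token margin with training cost amortized over all usage."""
    amortized_training = TRAINING_COST / total_1k_units_served
    return PRICE_PER_1K - SERVE_COST_PER_1K - amortized_training

# As volume grows, the amortized training cost vanishes and margin
# approaches price minus inference cost.
for volume in (1e11, 1e12, 1e13):
    print(f"{volume:.0e} units served: margin ${margin_per_1k(volume):.6f} per 1K tokens")
```

With these made-up inputs, margin is break-even at 1e11 units and climbs toward $0.001 per 1K tokens as volume grows, which is the shape of the argument above.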

Nov 1, 2023 · Liked by Vikram Sreekanti

"per-token inference costs for fine-tuned GPT-3.5 is 10x more expensive than GPT-3.5 it is still 10x cheaper than GPT-4!"

This is off.

Fine-tuned GPT-3.5 is 2.5x cheaper than GPT-4: $0.0120 / 1K tokens vs. $0.03 / 1K tokens.
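As a sanity check on the comparison above, here is the arithmetic using the per-1K-token prices quoted in the comment (Oct/Nov 2023 list prices; rates may have changed since):

```python
# Per-1K-token input prices as quoted in the comment (late-2023 rates).
FT_GPT35 = 0.0120  # fine-tuned GPT-3.5, $ per 1K tokens
GPT4 = 0.03        # GPT-4, $ per 1K tokens

ratio = GPT4 / FT_GPT35
print(f"GPT-4 costs {ratio:.1f}x as much as fine-tuned GPT-3.5")  # 2.5x
```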

author

Thanks for the catch! We definitely got our wires crossed somehow while writing this. We'll update the text of the blog.

“Open-source models must get smaller over time…For everything else, the major LLM providers will dominate.”

I’ve been racking my brain lately trying to figure out where this leads. Open-source can find niches, I think - but man oh man the major LLMs are just so smart (for whatever definition you choose for that word). And smartness kicks ass.

author

They are smart, but I think there are diminishing marginal returns to smartness. You don't always need the most expensive and capable model for every task — a specialized (fine-tuned) version of a smaller model can handle other tasks. That's why we think OSS models getting smaller over time will help: it makes it more attainable for orgs to fine-tune and deploy those models.

I would love to find - or create - situations out in the real world of orgs that find that sweet-spot mix of “genius model does these hard things” plus “specialty model does these specialized things”.

If you know of or encounter those, shout them to the world.

Oct 13, 2023 · edited Oct 13, 2023

I use OpenAI (as a paying subscriber) and also host multiple open-source LLMs locally.

The privacy aspect of the local models is definitely nice, but if you also use OpenAI, you can't help but be blown away by the sheer speed of the responses. It's hard to beat.

This assumes that OpenAI's pricing is sustainable, and not a land/developer grab. OpenAI wants you to abandon doing it yourself by subsidizing infrastructure (for now). That will end. Not to say that AWS is the cheapest way to do this - but when the industry is committed to losing billions of dollars to stake out turf, this is what you'll see. You definitely can't extrapolate this over time.

Because OpenAI's cost of running the actual infrastructure isn't going to be that much different from AWS's.

I also think if you upped the parameters significantly, it's a less rosy picture.

That, or the $11.3 billion in funding across 7 rounds from 14 investors lets them light a nearly limitless amount of money on fire, in the hope that in a few years all these companies offering GaaS (GPT-as-a-Service) will be in too deep to stop paying when they jack up the price.

Smaller specialized LLMs running locally may be the future both for companies and individuals who prefer to keep IP and other information a secret. This message written by a human.

OpenAI is squeezing the middle from both sides: the best capabilities and the cheapest way to try your first experiment.

Keep the back-of-the-envelope math for LLMs coming!
