GitHub just switched Copilot to metered billing, and developers are watching months of credits vanish in a single day

sanitation@lemmy.today · 2 days ago

SirEDCaLot@lemmy.today · 2 hours ago

Ever run an AI model locally? If you want the most capability you need a fast GPU with 32-48 GB of RAM. And that’s all for you, ONE user.

Copilot has millions of users, with tens or hundreds of thousands of them hitting the AI all at once. Each one needs thousands of dollars’ worth of GPU and RAM dedicated to them for the length of their query processing.

Where do you think the money to buy all that hardware comes from? You see OpenAI buying a double-digit percentage of the world’s RAM production, you think they got it on clearance sale?

No, there are investors. Investors who are pouring hundreds of billions into this AI stuff. And they don’t do this because it’s fun, they do it because they expect a BIG return.

So what’s going on is just like your neighborhood drug pusher, only the drug pusher is more honest. He says ‘first hit’s free, man’. The AI company says ‘AI models are an easy and cost-effective way to modernize your workflow!’ They don’t tell you that once you’ve integrated them and fired all the humans who know how to do the work, the price is gonna go way up.

Because the fact is, there IS a real cost of AI compute. GPU time, or at the large scale, datacenter space, power, cooling, etc.

In another few months to a few years, the C-suites will stop huffing the Kool-Aid and will start doing cost-benefit analysis on where AI is and isn’t cost-effective vs. humans. With any luck (for the AI people), by that time the AIs will be good enough that it’s a clear benefit. If not, this bubble’s gonna pop.

T156@lemmy.world · 1 hour ago

“Ever run an AI model locally? If you want the most capability you need a fast GPU with 32-48 GB of RAM. And that’s all for you, ONE user.”

Even then, that’s quite small. Top of the line frontier models would be looking at hundreds of gigabytes of video memory, and just as much RAM.

A terabyte of VRAM/RAM needed for something like Copilot is probably a fairly sensible estimate.
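For a rough sense of where numbers like these come from, here is a back-of-the-envelope sketch. It only counts memory for the model weights (parameter count times bytes per parameter) and ignores the KV cache, activations, and runtime overhead, so real requirements would be higher. The parameter counts are hypothetical round numbers, not the size of any actual Copilot or frontier model, and `weight_memory_gb` is just a helper name made up for this example.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and framework overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Hypothetical model sizes in billions of parameters (assumptions, not real figures).
for params in (8, 70, 500):
    # 16-bit is a common unquantized format; 8-bit and 4-bit are common quantizations.
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{params}B params @ {bits}-bit: ~{gb:.0f} GB for weights")
```

Under these assumptions, a 70B-parameter model needs about 140 GB at 16-bit and about 35 GB at 4-bit, which is why a 32-48 GB card only gets you so far, and a hypothetical 500B-parameter model at 16-bit lands right around the terabyte mark.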

phx@lemmy.world · 55 minutes ago

Depends on what you want to do, the model, and optimization or quantization.

A lot of LLM stuff that seemed pretty amazing a few years ago - chatbots and the like that respond to questions in plain language - can run on comparatively light hardware. Coding agents can take more, but could also be optimized against a particular language and spit out useful snippets.

Image stuff can be pretty complex especially at higher resolutions and detail, and creating seamless video segments gets expensive on hardware, fast.

heartSagan5@lemmy.zip · 2 hours ago

So, what you’re saying…is the AI Bubble is going to pop once the pencil pushers do the math? But they’re asking their local LLM for that… so it isn’t happening?

SirEDCaLot@lemmy.today · 1 hour ago

Not pop. Correct.

A lot of the managers aggressively pushing AI have little or no understanding of it themselves. They just hear of a technology that can make a human more productive by doing most of the work for them, so absolutely that’s worth a ton of money. It’s why many companies are encouraging, if not demanding, that employees start using AI: in their mind, one employee fully utilizing AI can do the work of two standard employees. Of course, they believe this because they’ve never actually had to use the damn thing themselves and thus don’t realize it doesn’t do all the work for you. Or worse, they think it does and your wonderful code base turns into spaghetti.

Side note: a few companies even had leaderboards for who was using the most AI tokens. This led to ‘tokenmaxxing’, trying to consume as many tokens as possible to prove you are adopting AI. Things like ‘Write unit tests for our company code base, then refactor the code base. Spin up an instance of Claude and another of ChatGPT to each generate unit tests of the old code and run them against the new code, then run the tests against each other to check each other’s work, submit full debug output to another instance of gpt 5.5 that will check for hallucinations…’ Keep that query going for a few paragraphs and you’ll have an army of AI workers all checking each other’s work, producing zero productive output but costing a fortune to run.
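To see how fast that kind of fan-out adds up under metered billing, here is a toy calculation. Every number in it is a made-up assumption for illustration (the per-token price, the tokens per call, the agent and round counts); it is not based on Copilot’s or any vendor’s actual pricing, and `fanout_cost_usd` is a hypothetical helper.

```python
def fanout_cost_usd(agents: int, rounds: int, tokens_per_call: int,
                    usd_per_million_tokens: float) -> float:
    """Cost of `agents` model instances each making one call per round.

    Assumes a flat per-token price and the same token count on every call,
    which is a simplification of how real metered billing works.
    """
    total_tokens = agents * rounds * tokens_per_call
    return total_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical numbers: 50k tokens per call, $10 per million tokens.
single = fanout_cost_usd(agents=1, rounds=1, tokens_per_call=50_000,
                         usd_per_million_tokens=10.0)
army = fanout_cost_usd(agents=5, rounds=20, tokens_per_call=50_000,
                       usd_per_million_tokens=10.0)

print(f"one call:            ${single:.2f}")
print(f"5 agents x 20 rounds: ${army:.2f}")
```

With these assumed figures a single call costs $0.50, while five agents cross-checking each other for twenty rounds cost $50.00 for the same task. Cost scales with agents times rounds, so a prompt that keeps spawning checkers can burn through a monthly allowance quickly.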
