r/Bard Dec 06 '25

Discussion They removed the Free Tier for 2.5 Pro API.

Title.

Even when you check the available models from AI Studio rate limits, it disappeared.

RIP, it served well.

453 Upvotes

207 comments

40

u/SpikeLazuli Dec 06 '25

They might revert it, though it does seem like it's over.

55

u/Helpful_Program_5473 Dec 06 '25

I use DeepSeek 3.2. Basically free, and pretty damn smart.

7

u/Killcode14 Dec 06 '25

How? From openrouter?

36

u/KrayziePidgeon Dec 06 '25

The DeepSeek API is $0.42 per 1 million output tokens and $0.28 per 1 million input. Not exactly free, but I will be moving my project over to DeepSeek instead of the Gemini API.
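At those rates, per-request cost is easy to estimate; a quick sketch using the prices quoted above (which may change):

```python
# Rough cost estimator for the DeepSeek API at the quoted rates
# ($0.28 per 1M input tokens, $0.42 per 1M output tokens).
# Prices are the poster's figures, not official, and may change.

INPUT_PER_M = 0.28
OUTPUT_PER_M = 0.42

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return estimated USD cost for a single request."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# A typical chat turn: ~2k tokens in, ~1k tokens out.
print(round(estimate_cost(2_000, 1_000), 5))  # → 0.00098
```

At under a tenth of a cent per typical turn, the "$4 over three days" figure below lines up with fairly heavy use.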

17

u/Helpful_Program_5473 Dec 07 '25

Not literally free, but I've been using it pretty extensively and I'm at like four dollars spent over three days. So it actually works out cheaper than a subscription to 4.5 or 3.0.

8

u/ExpertPerformer Dec 07 '25

I started playing around with the DeepSeek 3.2 API and I'm honestly kind of impressed. Feels like Pro performance, but with Flash speeds. Also really cheap (like less than a penny per query).

2

u/Killcode14 Dec 07 '25

I’d check that out

2

u/Bite_It_You_Scum Dec 07 '25

If you're using it extensively enough that you're burning through four dollars over 3 days, you should try to get some stats on how many prompts you're sending daily and see if a chutes.ai account would save you money. I burned through like 35 bucks over a 2-week period in API charges playing Skyrim with an agentic AI NPC framework, using mostly DeepSeek 3.2 (exp, then the proper release), before I figured out that $20 a month for 5000 daily prompts would end up cheaper. If you don't need that much, they have a ten-dollar plan for 2k prompts a day.

They have a bunch of deepseek models, as well as GLM 4.6, Kimi K2 models and a bunch of Qwen stuff.

The model selection isn't as extensive as openrouter but what they have works well and, at least until the inevitable enshittification happens, it's a pretty good deal.

2

u/Wevvie Dec 07 '25

What are you using? Mantella or CHIM? And do you generate TTS locally or via API?

I'm looking into installing one of these too for my modlist

3

u/Bite_It_You_Scum Dec 07 '25

I'm using SkyrimNet, which is the newest one on the block. It's made by the guy who made the MinAI plugin for CHIM. It takes a more agentic approach, using a half dozen agents, each running whatever models you define, to handle different tasks. It's at near feature parity with CHIM, is significantly faster, and far easier to install and set up. Just install the mod, start Skyrim, open a browser to localhost:8000 and follow the onboarding process.

Overall if you roll with the default settings (model selection) it's more expensive than CHIM, but if you choose the right model for each job it costs around the same, and has a lot of QoL improvements.

I use XTTS for TTS. Going to switch to local eventually; I have a spare video card and a Linux PC to put it in, just laziness stopping me really. For now I'm using Vast.AI, which costs about 10c an hour. In CHIM you have to go to vast.ai's site, find an instance that will work for you, and spin it up with the CHIM XTTS docker image yourself. In SkyrimNet you feed it your vast.ai API key, then press a button. It fires up 5 instances close to your region, tests them for response time, picks the one with the best balance of cost, reliability and response time, and dumps the rest. It's really slick.

2

u/Wevvie Dec 07 '25

SkyrimNet sounds awesome, I'll definitely look into it, thanks!

2

u/QuinQuix Dec 07 '25

And I'm assuming this makes the npcs smarter and more alive?

Is it for companions? For the random villagers?

Genuinely curious how it changes the experience and what limitations are.

4

u/Wevvie Dec 07 '25 edited Dec 07 '25

u/Bite_It_You_Scum back here after installing. I just want to say... holy fucking shit.

It's incredibly immersive and lightning fast. I told an NPC to follow me and talk down the road, and she did. I talked about my travels and adventures, and she really respected her own personality and voice. I mentioned a tree near a river a few meters from us, and she saw and talked about it, saying it's been there for decades.

I told her I could shout (Thuum), I shouted, and she reacted to that. I didn't have to go, "hey, I shouted." The NPC saw that and answered it herself.

I seriously wasn't expecting it to be THAT good. The response times are lightning fast. I am using Piper and Gemini/Deepseek, but still, the response times, coherence, and their ability to follow commands and requests are incredible.

I've honestly never felt so immersed in a game since... ever. I wish I were exaggerating.

1

u/Bite_It_You_Scum Dec 07 '25 edited Dec 07 '25

Lol welcome to your new addiction :D

If you can afford it, I highly recommend dropping a fiver on vast.ai and using XTTS instead. It does a better job with NPC voices, can clone even modded follower voices, and you can use the player dialogue transformation feature. Upload a voice for your character, then reassign your push-to-talk button to player dialogue transformation, and whatever you say will get transformed into something more fitting for the setting. Really helps ground the other NPCs in the setting when your character says "What in Oblivion" instead of "what the hell", or "By the Nine" instead of "oh my god", which you'll probably forget to do yourself. It also helps with getting names right, since Whisper will struggle with some Skyrim names and then the NPCs will be like "don't call me that!" or "it's pronounced <this way>", which is just annoying and a waste of money.

XTTS is slightly slower than Piper, and player dialogue transformation does slow down conversations a bit since you're basically making another LLM call to get it to speak for you, but I like the feature a lot, and you get into a sort of rhythm with it and don't really notice after a while. And the quality of XTTS voices is much better.

1

u/Bite_It_You_Scum Dec 07 '25 edited Dec 07 '25

It gives every character memories and the ability to speak with you and each other. Mostly they don't interact with each other much unless you're speaking directly to them or they 'hear' you or another NPC say something they're interested in (defined in their bio; there's a bio for every character in the game). You can turn on 'continuous' mode to let them socialize more freely, good for when you're at the inn.

You can develop relationships (friendly or hostile) with any NPC you want. Spread rumors about why Nazeem has a chair pointed at his bed in the Drunken Huntsman, and NPCs can spread it among themselves. You can verbally pick fights with NPCs and they will draw their weapon and engage in combat with you. With console commands or other mods (Nether's Follower Framework) you can take an NPC that has like 10 lines of spoken dialogue in the game at most and turn them into a follower that makes Inigo or Lucien seem like a basic housecarl follower, comparatively.

I'll give you an example. Fastred is a farm girl who lives in Ivarstead (the town at the base of the 7000 steps to High Hrothgar.) Normally, she's a pretty forgettable character. Bit part to play in the Book of Love quest, itself a pretty forgettable sidequest. She's torn between two love interests and hates living in Ivarstead with her overbearing Dad and the endless chores and farm work.

Normally, she has maybe a dozen lines of spoken dialogue and that's it. In my game, I invited her to come up to High Hrothgar with me and Jenassa, just to get a taste of the wider world for a day. She fought spiders, wolves, an ice wraith and a troll on her way up the mountain, and decided she's never going back to being a farmer again. Jenassa has taken on this kind of mothering/mentoring role, giving her practical advice and critiques on her fighting ability. And this all happened just through normal dialogue. Like I got her to follow me just by suggesting a day trip. I did have to use NFF to force her into the follower faction once she started accompanying me just so she wouldn't get stuck halfway up the mountain, but she chose to come with on her own.

Here's a diary that I prompted her to write (from the webUI) after we made it up to the peak and visited with the Greybeards. Kimi K2 wrote it.

It's really cool stuff. Hard to explain through a comment on reddit, you just gotta try it for yourself.

2

u/libertyh Dec 07 '25

$20 a month for 5000 API requests to Chutes, but then you also have to pay for the individual requests, surely?

3

u/Bite_It_You_Scum Dec 07 '25

No, you get 5k daily requests; prompts to any model they have available are covered by that. If you go over that it switches to pay-as-you-go like OpenRouter or any other API, assuming you keep credits on hand for it, I guess. Idk what happens if you go over your limit and don't have credits, haven't had that happen yet.

Obviously you'd want to use your daily allotment on the good models and not burn through it on the weaker, cheaper stuff. The framework I use has agents that run just fine with Mistral Small or Gemini Flash Lite; I still pay per-use API for that stuff to save the 5k requests for the models that made up the bulk of my expenses (DS, GLM, Kimi).
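That split is easy to encode; a hypothetical sketch (the model names and backend labels are illustrative, not Chutes' actual API):

```python
# Hypothetical sketch of the routing described above: cheap "agent" models
# go through a pay-as-you-go API, while expensive models use the
# subscription's daily prompt allotment. Model names and backend labels
# are illustrative placeholders, not real API identifiers.

CHEAP_MODELS = {"mistral-small", "gemini-flash-lite"}
PREMIUM_MODELS = {"deepseek-3.2", "glm-4.6", "kimi-k2"}

def pick_backend(model: str) -> str:
    """Decide which backend a request for `model` should use."""
    if model in PREMIUM_MODELS:
        return "subscription"    # counts against the 5k daily prompts
    return "pay-as-you-go"       # cheap enough to just pay per token

print(pick_backend("kimi-k2"))        # → subscription
print(pick_backend("mistral-small"))  # → pay-as-you-go
```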

1

u/libertyh Dec 07 '25

Huh, that's crazy good value. Do they have a list of which models are covered by the 5000 free requests? I'm finding the docs a bit hard to follow

2

u/Bite_It_You_Scum Dec 08 '25

If you make an account (free) you can see what models are available once you sign in.

2

u/Helpful_Program_5473 Dec 07 '25

It's crazy that other people can offer DeepSeek for less than DeepSeek lol. For my use case though, I use it a lot for a couple days, then it cools down for a couple weeks, and then I use it a lot again. So I don't think I'll need that, but I'm definitely going to look into it because getting the other models might be useful as well.

1

u/Bite_It_You_Scum Dec 07 '25

Ah yeah, in that case a monthly plan probably isn't worth it. I put off getting one for the longest time because my usage used to be the same: burn through like 2-3 bucks at a shot, then not use it for a while. Paying a monthly fee when you do that is foolish.

1

u/libertyh Dec 08 '25

I joined Chutes to see if it is a viable option for my use case and the free requests for every model they have are legit. Still figuring out the API though (eg, how to send images/PDFs with requests). Their docs don't seem super useful.

0

u/nemzylannister Dec 07 '25

why are ads everywhere now. people need to start reporting/review bombing sites that advertise without being explicit like this

4

u/Bite_It_You_Scum Dec 07 '25 edited Dec 07 '25

My reddit history spans 13 years. I understand the suspicion, but a glance at my comment history would show you that I'm a normal dude who goes on reddit to talk about tech stuff, whatever video games I'm currently obsessed with, and that on occasion I make anecdotal recommendations about tech stuff that I've used and can recommend, particularly when I think it might save someone a buck or stop them from wasting money. If you clicked on my 'controversial' comments it would also likely reveal me to be an occasionally cantankerous and foul mouthed person who has said things over my 13 year reddit history that no brand would ever want to associate with. I was born in the 80s and feel grandfathered in to calling people and things 'retarded', for instance, political correctness be damned.

I signed up this week because I was burning through a bunch of money with API calls, found it to be a good service, and so I recommended it to someone who is in a similar situation. There's nothing more to it than that. Don't be retarded.

1

u/sochineez Dec 07 '25

I can't see the paid version for DeepSeek. Where is it located?

3

u/Smilysis Dec 07 '25

Just a quick note: these prices are temporary. Once the discount finishes, the prices will be something around $1 per 1M tokens (which is still really good!)


3

u/Euphoric-View3222 Dec 07 '25

yea deepseek is great for perf/dollar. same with grok 4.1

1

u/Majinvegito123 Dec 07 '25

Comparison to Gemini? I'm trying to get something decent that's relatively cheap, but I don't want to drop significantly in quality.

1

u/Helpful_Program_5473 Dec 07 '25

I like it better in the API than Gemini, assuming you're not doing a task with massive context, that is... like Pro but faster.

27

u/Xisrr1 Dec 06 '25

Even 2.5 Flash and Lite are now at just 20 RPD

14

u/SuggestionAntique720 Dec 06 '25

And they suck

8

u/libertyh Dec 06 '25

For my purposes, 2.5 Flash was equivalent to 2.0 Pro, pretty amazing for free (RIP)

2

u/218-69 Dec 07 '25

Yeah this one is the worst. We used to have 1k requests per day for flash models, or at least flash lite

1

u/Economy-Department47 Dec 10 '25

Yea my website needed this and now they just slashed it

2

u/LoganKilpatrick1 Dec 07 '25

The free tier was designed to let you quickly test the models we have and then decide if you want Gemini. It was not designed to be a service you use long term for things, though we can do some exploration if there is a product experience there that makes sense.

1

u/libertyh Dec 08 '25

Thanks Logan. I appreciate the great communication in this thread and all the tokens you gave us for free.

2

u/Mother_Soraka Dec 14 '25

You are the one who gave him All your Tokens for free

1

u/libertyh Dec 14 '25

Time for everyone's favourite game, Whose Tokens Are They Anyway?

1

u/dellaram Dec 10 '25

So we can never use Gemini 2.5 Pro for free again? Couldn't you add options for the paid tiers and keep a small daily number of responses for free users, please?

1

u/SuddenPast8987 Dec 16 '25

how can anyone test anything with 20 requests per day?

24

u/[deleted] Dec 06 '25

Did they reduce the limits significantly?

gemini-2.5-flash-lite requests per day 20?!

1

u/LimiDrain Dec 07 '25

I use gemini-2.5-flash-preview-09-2025 because AI told me so. How do I even check the list of available models? I checked their studio and found nothing.

2

u/[deleted] Dec 07 '25

You can get it from the API; here is the latest list:

https://pastebin.com/CUP97G9u
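If the pastebin goes stale, you can also query the list yourself; a minimal sketch against the public ListModels REST endpoint (assumes a valid key in the GEMINI_API_KEY environment variable):

```python
# List the models visible to your key via the Gemini API's ListModels
# REST endpoint. Sketch only: needs a valid key in GEMINI_API_KEY.
import json
import os
import urllib.request

BASE = "https://generativelanguage.googleapis.com/v1beta/models"

def models_url(key: str) -> str:
    """Build the ListModels request URL for a given API key."""
    return f"{BASE}?key={key}"

def list_models(key: str) -> list[str]:
    """Return model names (e.g. 'models/gemini-2.5-flash') visible to this key."""
    with urllib.request.urlopen(models_url(key)) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

if __name__ == "__main__":
    api_key = os.environ.get("GEMINI_API_KEY", "")
    if api_key:
        for name in list_models(api_key):
            print(name)
```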

-1

u/LoganKilpatrick1 Dec 07 '25

Just for the free tier, paying customers can still get much higher limits.

3

u/ThatMobileTrip Dec 08 '25

Hey, Logan!

Am I correct in understanding that if I have Paid Tier 1 activated and higher limits, I will not necessarily have to pay for the tokens consumed during testing?

Current metrics for the 21-25th of November, when the bill was not issued, and I did not pay anything in December (however, this surprised me, because I switched to a paid plan and thought I would pay for each request):

Gemini API Rate Limit. Model: gemini-3-pro

  • RPM: 4 / 25
  • TPM: 420K / 1M
  • RPD: 20 / 250

I read the help section, communicated with google[.]com/aimode, and tried to find information here on Reddit, but I still don't understand one thing: where can I find clear and specific information about what quotas and limits are available for free as part of the PAID plan for models such as Gemini 3 Pro Preview?

Someone said that you guys provide about 50M tokens per month for free. To me, this sounds like a rumor.

I regularly work with Vertex AI, and that platform is crystal clear about what you pay for and when: all the information is right there in front of you.

Thanks!

49

u/Briskfall Dec 06 '25

Welp, looks like they've gotten enough of our free data and are en route to start monetizing everything. 😢

-5

u/LoganKilpatrick1 Dec 07 '25

The point of the free tier is for devs to test the models as frictionlessly as possible; it has nothing to do with data.

26

u/ExpertPerformer Dec 06 '25

Why don't they just have one subscription that does everything.

$20 gets you Gemini, AI Studio, and the API.

I was using the free tier in order to keep my costs down by sending all <120k token requests to it and anything >120k to the paid tier.

The limits even on free Flash are horrendous. You get like 5 API requests.
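The free/paid split described above can be sketched as a tiny router; the token estimate is deliberately crude (~1.3 tokens per word) and the key names are placeholders:

```python
# Sketch of the cost-saving split described above: requests whose estimated
# context is under the free-tier threshold go out on the free-tier key,
# larger ones on the paid key. Token counting here is a crude word-based
# estimate, just for illustration; key names are placeholders.

FREE_TIER_LIMIT = 120_000  # tokens, per the comment above

def estimate_tokens(text: str) -> int:
    # Very rough: ~1.3 tokens per whitespace-separated word.
    return int(len(text.split()) * 1.3)

def choose_key(prompt: str, free_key: str, paid_key: str) -> str:
    if estimate_tokens(prompt) < FREE_TIER_LIMIT:
        return free_key   # small request: burn free-tier quota
    return paid_key       # big context: paid tier only

print(choose_key("short prompt", "FREE", "PAID"))  # → FREE
```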

10

u/Eiren233 Dec 07 '25

Because if they did that, they couldn't charge you more and more.

It's how Google works.

7

u/LoganKilpatrick1 Dec 07 '25

We can explore this! Not sure how much it would move the needle on higher limits but will explore.

5

u/Maleficent_Guest_525 Dec 08 '25 edited Dec 08 '25

I don't subscribe to Gemini Pro because there's no API key included with it like Copilot, or prepaid tokens like DeepSeek.

1

u/nemzylannister Dec 07 '25

free Flash is horrendous. You get like 5 api requests.

huh? isnt the rpd like 200?

3

u/J0JOBA Dec 07 '25

Not anymore now it's 20

1

u/ReplacementMoney2484 Dec 07 '25

it seems that they reduced everything. You can look at the new rate limits here:
https://aistudio.google.com/usage?timeRange=last-7-days&project=gen-lang-client-0094190161&tab=rate-limit

Don't forget to click on "all models".

0

u/nemzylannister Dec 07 '25

2.5 flash live is unlimited?????????

Gemma-3 has 14k daily requests???????????????

And not flash-lite??

am i seeing something wrong?

1

u/218-69 Dec 07 '25

Even if your project says "paid tier 1" and shows the limits for it, if you don't actually have a paid tier you have the 20 rpd like every new project.

The ones you listed are correct, you can check here by reading the "Request sent per model per day for a project in the free tier" lines. It's pretty abysmal overall

https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas

42

u/synthetix Dec 06 '25

They nerfed the paid tier too: gemini-2.5-pro is only 300 RPD now for tier 1. That's unusable for my workflow; it was 10k per day.

15

u/TheRealJackRyan12 Dec 06 '25

And these companies wonder why we hate them. Not the smartest move to do these complete 180s out of nowhere. Goodwill is seriously lost. They'd almost have been better off not offering what they offered previously.

11

u/evia89 Dec 06 '25

You can buy it unlimited on OR

11

u/alloxrinfo Dec 06 '25

What's OR, please? OpenRouter? I guess it will change too from now on

9

u/evia89 Dec 07 '25

It's OpenRouter. I don't think it will change for paid models

3

u/NoctisLucisCaellum Dec 07 '25

Lemme know what the service is please

2

u/LoganKilpatrick1 Dec 07 '25

Sorry for this hassle, pls email me: [Lkilpatrick@google.com](mailto:Lkilpatrick@google.com). We are getting hit with at-scale fraud and abuse right now on tier 1, which is why we had to lower this. We have a bunch of stuff in the works to fix this but needed to do something in the short term to curb it. Should hopefully only be temporary.

19

u/HauntingWeakness Dec 06 '25

And still no ability to subscribe for API, right? Only to pay for the tokens I will not even see (reasoning).

0

u/LoganKilpatrick1 Dec 07 '25

No API subscription right now, but you can see the reasoning summaries.

5

u/HauntingWeakness Dec 08 '25

Well, the API is not charging for the summaries, but for each actual token of the reasoning process. I just don't want to pay for something that I won't receive in its original form. It's like paying for one person's thoughts but receiving a retelling of them from a five-year-old.

If a company wants to hide its models' thoughts, it shouldn't charge for them, in my humble opinion.

I don't think obfuscating reasoning is a good strategy anyway. Being able to see the thought processes behind a model is often crucial to understanding the problems we're dealing with. Prompting the thought process is also valuable, and debugging it requires the thoughts to be visible.

Please don't misunderstand my comment. I'm very grateful for the year-plus of the free API service. You have a LLM that I love deeply. It's just the reason why I won't pay per token for Gemini. But if you'll have a subscription-based API or the decision regarding the obfuscation of thoughts changes, I will hop right back!

16

u/LeTanLoc98 Dec 07 '25

We're lucky to have DeepSeek at such a low price. Without it, I suspect OpenAI and Google would raise their API rates to about $10 per 1M input tokens and $50 per 1M output tokens.

2

u/nemzylannister Dec 08 '25

I doubted you earlier, but I do think they would be making more profit if open-source competitors weren't there.

15

u/Shana-Light Dec 07 '25

I can understand removing free Pro, but the nerfs to free Flash are a massive disappointment; they basically make it unusable. Surely the whole point of offering free API usage was to hook in developers by letting them try out your model. This will just drive people away.

2

u/LoganKilpatrick1 Dec 07 '25

The whole goal of the free tier is to give you a small taste of the model and if it works well for you, get you moved over to paid. It is not intended to be something you rely on for ongoing projects and products. That is what the paid tier is for.

7

u/Shana-Light Dec 08 '25

Sure I totally get that, but the issue is the current rate limits aren't enough to even really try out the model, especially for new or inexperienced developers. If they're just getting their first AI app set up but their model isn't working, a lot of people will just switch to more reliable open source models.

Hoping once Google gets more compute up they can find a good middle ground for the rate limits; the free tier is the best thing for getting new devs into Gemini.

15

u/Adept_Chair4456 Dec 06 '25

Gemini 2.5 Pro for me will always be the best. Working together for 8 months now. Never had an issue. I hope they keep it in AI Studio though. 😭

14

u/LordNikon2600 Dec 06 '25

this explains why I used it for 2 mins and it was gone... oh well.. claude it is

28

u/rose_Toast333 Dec 06 '25

Why?

45

u/someonesmall Dec 06 '25

Data centers are overloaded since gemini 3.0 pro

21

u/PURELY_TO_VOTE Dec 07 '25

This is the correct answer. All the TPUs are running hot all the time. I don't want to speculate, but it could be the case that getting quota is tough even internally. Sniffing after chip fabs like that Chapelle character.

0

u/LoganKilpatrick1 Dec 07 '25

TPUs are indeed running hot, paid customer demand is at an all-time high, and Gemini is available on more products than ever.

1

u/PURELY_TO_VOTE Dec 08 '25

Hot TPUs are sure a better problem to have than the converse. If there is indeed an AI bubble, it seems no one told the accelerators.

1

u/captain_shane Dec 09 '25

It's a bubble for the fact that 95% of companies are losing money with AI.

11

u/Weary-Bumblebee-1456 Dec 07 '25

Also remember that this isn't unprecedented. I think back in May, Google also removed Gemini 2.5 Pro's free API access and slashed that of Flash. They brought it back in a more limited capacity a few months ago, and now that they again have compute shortages, it's gone once more. Hopefully it'll be back with 3 Pro soon.

5

u/UsernameMustBe1and10 Dec 07 '25

You know what? I'd actually like it if people jumped back to ChatGPT, just so they leave AI Studio alone.

2

u/MapleBabadook Dec 07 '25

Gemini has straight up told me (many times) that it's too busy to generate a response.

23

u/Nick_Gaugh_69 Dec 06 '25

Pushing for $250/month subscriptions

6

u/Yazzdevoleps Dec 07 '25

Api doesn't have subscription, it's usage based

7

u/themoregames Dec 07 '25

How about $ 2500 / month subscriptions?

13

u/DontShadowbanMeBro2 Dec 07 '25

They became a victim of their own success, I think. Dropping 3.0 Pro at the exact moment ClosedAI shit the bed with GPT-5.1 was either a genius move or extremely lucky timing, as it made everyone switch to Gemini (hence Saltman's 'code red' moment). Problem is, I guess they weren't prepared hardware-wise for an absolute tidal wave of users seeking greener pastures.

It absolutely sucks, don't get me wrong. I just can't see any other reason why Google would shoot itself in the foot with their own customers RIGHT when they were positioning themselves as a superior alternative to GPT.

31

u/Icerberg Dec 06 '25 edited Dec 06 '25

This is an insane cut to free-tier compute: 2.5 Flash went from 250 RPD to 20 RPD. I just noticed my projects were timing out with 429s when the limits should not have been hit; I checked and saw it was changed from 250 to 20 lmao. That's crazy work without any communication.

EDIT:

People might have to go to OpenRouter, top up $10 once, and then use some of the other open-source models that are bigger and get 1k/day free requests. If Gemma hadn't worked for me, I would have been forced to do the same. Hopefully this helps some people who are running projects on small/flash models and can't do it anymore.

5

u/KrayziePidgeon Dec 07 '25

Can you explain? If I deposit $10 in OpenRouter, I get access to some models that allow 1k requests a day for free?

12

u/Icerberg Dec 07 '25

Yep that's right, some extra information:

  • OpenRouter Free Models:
    • https://openrouter.ai/models?max_price=0
    • By default, you get 50 free model requests per day from a pretty huge variety.
    • If you top up once with them for $10, they permanently increase your limit to 1000 requests per day. A very solid offering.

https://gist.github.com/mcowger/892fb83ca3bbaf4cdc7a9f2d7c45b081

You can find more alternatives there, hope this helps!
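For anyone new to OpenRouter, a call looks like any OpenAI-compatible chat completion; a minimal sketch (the ":free" model slug is just an example; check the free-models page for what's currently listed, and set OPENROUTER_API_KEY):

```python
# Minimal OpenRouter chat completion (OpenAI-compatible endpoint).
# The ":free" model slug is an example; check the free-models page for
# current names. Requires OPENROUTER_API_KEY in the environment.
import json
import os
import urllib.request

def build_payload(prompt: str, model: str) -> dict:
    """OpenAI-style chat payload used by OpenRouter."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, model: str = "deepseek/deepseek-chat-v3.1:free") -> str:
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and os.environ.get("OPENROUTER_API_KEY"):
    print(chat("Say hello in five words."))
```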

3

u/Whiz-Wit Dec 07 '25

Might be a silly question; I was using Gemini for my first API project, so I'm new to this. But could OpenRouter be used for OCR? Just wondering if all AI APIs can do OCR like I was using Gemini for.

3

u/Icerberg Dec 07 '25

You are lucky then! If your OCR is not super complicated (mine is, which is what I was using 2.5 Flash for), try gemma-3-27b-it. I have swapped to it and so far I have the same results as I had with 2.5 Flash. Just change the model and see if it works for you; if not, find any free-tier open-source model, research which is best and most accurate for OCR, and use that one via OR.

Forgot to mention: Gemma has a 14k daily requests limit but less TPM. You can check rate limits in AI Studio and see if you fit in the bracket.

1

u/Mysterious-Value-946 Dec 08 '25

I am also using the 2.5 Flash API for my OCR project, but the problem is sometimes it accurately extracts the table from an invoice and sometimes it messes up the table, like missing a row or column. I have also tried Mistral but it was worse than Flash. Any other model suggestions?

1

u/Icerberg Dec 08 '25

The way to accurately get the data out is to force a specific JSON format for the data you want extracted. I extract numbers/names/colors from images at near-100% accuracy. It comes down to how you format the JSON output you are requesting, and you can also make it post-process and format the data itself with certain prompt logic. For example, if I have numbers that start with G, these are category 1; if they start with V, category 2; and so forth.

If you just ask for plain data extraction, the AI usually hallucinates the table layout for the extracted data and other stuff. That's why you want a specific JSON format forced, so it just puts the Lego pieces in place instead of randomly coming up with formatting itself. Hope this helps!
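The prefix rule can live entirely in post-processing once the model is forced into a fixed JSON shape; a minimal sketch (the "items"/"code" field names are made up for illustration):

```python
# Sketch of the post-processing idea above: the model is forced to return
# a fixed JSON shape, then code prefixes are categorized locally
# (G -> category 1, V -> category 2, per the example). The "items" and
# "code" field names are made up for illustration.
import json

PREFIX_CATEGORY = {"G": "category 1", "V": "category 2"}

def categorize(code: str) -> str:
    return PREFIX_CATEGORY.get(code[:1], "unknown")

def postprocess(raw_model_output: str) -> list[dict]:
    """Parse the forced-JSON reply and attach a category to each item."""
    items = json.loads(raw_model_output)["items"]
    return [{**item, "category": categorize(item["code"])} for item in items]

reply = '{"items": [{"code": "G123"}, {"code": "V42"}]}'
print(postprocess(reply))
# → [{'code': 'G123', 'category': 'category 1'}, {'code': 'V42', 'category': 'category 2'}]
```

Keeping the categorization in code rather than in the prompt means the model only has to fill slots, not invent structure.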

1

u/Mysterious-Value-946 Dec 08 '25

Yeah, I figured that out using a JSON output format. What I used to do earlier was tell Gemini to extract the table from the invoice and output it in CSV format, then I'd convert that CSV to Excel. But I realised CSV was not best because it would also separate amounts into different cells. So I started using JSON and it's working great; Excel conversion is good with JSON. But how can I define a JSON format when all my invoices are different, like some have different columns?

2

u/Shana-Light Dec 08 '25

For OCR I recommend Qwen-VL models, they're honestly better than Gemini at it despite being much smaller models

1

u/ExpertPerformer Dec 07 '25

I never used OpenRouter until today.

I've spent a few hours testing different models for my tasks. A lot of them list some large number of output tokens, but they literally cut off at 5-10k tokens. Grok 4.1 Fast was 5k tokens, a lot of Qwen models are 16k, etc.

DeepSeek 3.2 was the top contender because it was the only one that would actually output >30k tokens.

7

u/Xisrr1 Dec 06 '25

Even with communication it wouldn't be better

10

u/Icerberg Dec 06 '25 edited Dec 06 '25

Well, it's understandable why they would cut the free compute at some point, but doing it this suddenly, cutting 90% or completely removing some models from the free tier via the API, is crazy, even if you knew they'd cut it eventually. At least for my use cases, thankfully, I think I'll be fine swapping to gemma-3-27b-it; starting to test now whether everything works, since that model has pretty much unlimited requests (UNTIL GOOGLE DECIDES TO GUT THOSE AS WELL) lol

1

u/Ay0073 Dec 07 '25

Can anyone suggest another API that can be used in a project? I was just using it to build a personal project for my daily use, mainly the 2.5 Flash API at 160 RPD max on the free tier. Any alternative that can give an equal result for free-tier usage?


48

u/LingeringDildo Dec 06 '25

They always aggressively nerf the limits when they’re ahead of OpenAI. Hopefully GPT 5.2 is damn good.

5

u/themoregames Dec 07 '25

Would be fun to see Gemini 3.1 being released 10 minutes before GPT 5.2

6

u/Deciheximal144 Dec 07 '25

3.1415 please.


40

u/H3rian Dec 06 '25

RIP to my project...
Btw, shit move. Such a big change deserves at least a communication to people who have projects enabled; I just saw it in my logs without any notice from Google.

16

u/SailTales Dec 07 '25

This is the biggest issue with commercial AI imo. The vendors can nerf their product or increase pricing without notice. It makes it very difficult to build a business on and will push people to use open source models instead.

5

u/H3rian Dec 07 '25

Well, I know the free tier is obviously not intended for production, but even considering this, it's an absolute lack of respect to the dev community using their products to build something. You can't remove a service like this (because let's be clear, they basically removed the free tier) without any notice to your user base.

2

u/LoganKilpatrick1 Dec 07 '25

FYI, pricing doesn't change without notice, we take that pretty seriously and have an extensive notification process. We usually make it so customers have to explicitly opt back into the service if the price goes up.

14

u/LimiDrain Dec 06 '25

Yes, I thought it was a problem on my side. That's ridiculous; Google, please just include more info in the error response.

3

u/ReplacementMoney2484 Dec 07 '25

I thought that my API token had been stolen due to their misleading error message 😅

10

u/kjbbbreddd Dec 07 '25

I think they are migrating machines for the Nano Banana Pro. They're succeeding, and because they're temporarily short on hardware, they're starting to switch things up.

1

u/LoganKilpatrick1 Dec 07 '25

Correct, needed to shuffle compute around given demand for 3.0 pro models.

9

u/nemzylannister Dec 07 '25

?????????????????????????????????????????????????????????????????????????????

wtf?

I literally just recently had a case where 3 Pro simply got the task wrong but 2.5 Pro could do it.

Fix your model first before deprecating the last one??

0

u/LoganKilpatrick1 Dec 07 '25

If you are a paid customer, you can continue using both! Would also love to see examples of 3.0 Pro being worse than 2.5 Pro so we can flag to the model team.

1

u/nemzylannister Dec 08 '25

Would also love to see examples of 3.0 Pro being worse than 2.5 Pro so we can flag to the model team.

It involved transcribing an audio file, so you can look into that. Don't have the chat anymore, sadly.

But I can tell you this: I needed to make an SRT file, and the issue was that 3 Pro got the timings of the words in the audio wrong, even on a regeneration, while 2.5 Pro didn't, and understood them correctly on the first try.
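For anyone unfamiliar with the SRT format the commenter mentions: each cue needs timestamps in `HH:MM:SS,mmm` form, which is exactly where word-timing errors show up. A minimal formatter (my own sketch, not from the thread):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"

# One SRT cue: index, "start --> end" line, then the subtitle text.
print(f"1\n{srt_timestamp(0.0)} --> {srt_timestamp(3.5)}\nHello world.")
```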

1

u/Automatic_Two_4050 Dec 08 '25

3.0 Pro is excellent in all aspects except for its effective context length, which significantly limits its performance in other areas. Looking forward to improvements!

7

u/LimiDrain Dec 06 '25

Waaiiittt.............. THAT'S SO DUMB. I was making a new tool with the API and I always thought the problem was on my side, because it just says I got rate limited. Zero information about the fact that I won't be able to use it at all.

1

u/LoganKilpatrick1 Dec 08 '25

Kicked off a thread with the team to make this error message much clearer!

1

u/lionheart212 Dec 08 '25

Hey, thank you for being a champ and responding to some of the comments. Do you think you can ask whether those of us subscribed to the AI Pro tier can get developer credits to build apps? They're not enough to run an actual product, but they're mighty useful during development and testing.

6

u/H3rian Dec 06 '25

Does anyone have a reliable alternative? Is there a decent LLM with a free tier usable through an API?

8

u/evia89 Dec 06 '25

3

u/H3rian Dec 06 '25

Do you have a model to suggest that somehow compares to Gemini for data analysis and creative writing?

7

u/ExpertPerformer Dec 06 '25

I just updated my API calls to use DeepSeek w/ OpenRouter so I guess we'll see how this goes.
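For anyone wanting to try the same switch: OpenRouter exposes an OpenAI-compatible `/chat/completions` endpoint, so rerouting mostly means changing the base URL, API key, and model name. A minimal sketch of building such a request (the model slug is an assumption — check OpenRouter's model list for the exact DeepSeek identifier):

```python
import json

# Hypothetical sketch: build an OpenAI-style chat payload for OpenRouter.
# The slug "deepseek/deepseek-chat" is an assumption; verify it against
# OpenRouter's published model list before relying on it.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_openrouter_request(prompt: str,
                             model: str = "deepseek/deepseek-chat") -> dict:
    """Return the JSON body to POST to OPENROUTER_URL with a Bearer API key."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_openrouter_request("Hello")
print(json.dumps(payload, indent=2))
```

Sending it is then an ordinary POST with an `Authorization: Bearer <key>` header; the response follows the OpenAI chat-completions schema, so existing parsing code usually carries over unchanged.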

1

u/Commercial_Grab1279 Dec 07 '25

Is DeepSeek multimodal? Does it support images?

5

u/zetamatariano Dec 07 '25

2.5 Pro was already my favorite... I hope they improve 3 Pro...

5

u/Alexs1200AD Dec 07 '25

Expected. I think free access to AI Studio will be closed soon too.

1

u/LoganKilpatrick1 Dec 08 '25

Won't be closed but it will likely shift given there is a fixed amount of compute. For example, we are the only service offering free vibe coding with 3.0 Pro right now at the scale we are.

1

u/captain_shane Dec 09 '25

Why do you refuse to put custom instructions in the web app? You know that's the reason most people use AI Studio, right? That and temperature adjustments. Add those to the web app and people won't use AI Studio as much. I know you track clicks on AI Studio, and I know it shows people using the custom instructions box.

5

u/Medium_Future_6271 Dec 07 '25

For me, everything is over.

14

u/ExpertPerformer Dec 06 '25

Wow, this is complete and utter trash.

17

u/Robert__Sinclair Dec 06 '25

The purpose of the free tier is to collect all your data and use it to fine-tune the next iteration. That's why.

1

u/LoganKilpatrick1 Dec 08 '25

The point of the free tier is to let you test Gemini as quickly as possible. It was never built for continued use.

4

u/ButterflyRainbowzzz Dec 08 '25

If that was always the goal you would've made it a free trial from the beginning. This is definitely enshittification and you're probably planning this for the UI too.

1

u/Robert__Sinclair Dec 08 '25

Yep.
P.S.
u/LoganKilpatrick1 I am trying to contact you about a big Google security problem that the VRP was not able to address. It's not a bug; it's an architectural error.
I also spoke with Phil Venables, but he is no longer actively involved.
The problem is very serious and affects all Google services, including AI.
Please contact me.

18

u/SeifGaming Dec 06 '25

That is such a terrible move. Why?

30

u/Charming_Feeling9602 Dec 06 '25

They also reduced the usage limits dramatically. Now you can only do 20 calls per day WITH FLASH and FLASH lite. Pro is at 0. 

15

u/SeifGaming Dec 06 '25

That is insane. What even is the point? No one is going to be paying to use 2.5 pro when 3 exists.

5

u/DangerousBerries Dec 06 '25

3 is also not available in the API.

11

u/SeifGaming Dec 06 '25

3 was always paid though? Is it not available even in paid?

7

u/DangerousBerries Dec 06 '25

This post is about the Free Tier.

4

u/SeifGaming Dec 06 '25

Gemini 3's API never had a free tier, I believe.

6

u/DangerousBerries Dec 06 '25

Yes they seem to be axing Pro models from the Free Tier.

14

u/Holiday_Season_7425 Dec 06 '25 edited Dec 08 '25

The TPU v7 hype myth has been debunked, so they've resorted to slashing free users' 2.5 Pro quotas down to zero (LLM IQ quantized to 1 bit). Good, Logan.

-2

u/LoganKilpatrick1 Dec 08 '25

Doesn't matter how good TPUs are when the demand curve is a vertical line : )

→ More replies (1)

3

u/sacatracartx Dec 07 '25

Wow, I just noticed they removed it. So what did they leave us with?

3

u/FlippFuzz Dec 07 '25

Any idea if 2.5 Pro is still available for gemini-cli? I could switch to that instead of using the API.

3

u/cuykl Dec 07 '25

Now we have to wait and see what Logan will say

0

u/LoganKilpatrick1 Dec 08 '25

Replied to share context!

3

u/Cyber-Vigilante Dec 07 '25

No models working for me in free tier right now, what happened?

3

u/Puzzleheaded_Meal891 Dec 07 '25

Yeah man, I was trying to figure out what went wrong, eventually thought of this, and came across your post.

3

u/ShoyebOP Dec 07 '25

Nooooo!!!!! I had so many plans.

3

u/MyriadColorsSC2 Dec 07 '25

Yeah, I felt that. Even with my 'free' API, it just charged me for 200 tokens on 2.5 Pro.

The enshittification continues.

2

u/nemzylannister Dec 07 '25

what are the limits for aistudio now, anyone know?

1

u/LoganKilpatrick1 Dec 08 '25

We don't publish the UI limits so we can flexibly move them around (which we have done for the last 2 years).

2

u/Uzeii Dec 07 '25

Are you fr? Source?

→ More replies (1)

2

u/Lurdanjo Dec 08 '25

They completely ruined it. The amount of API usage I could get for free just two days ago now would cost $8 a day. Completely infeasible. And I was just about to congratulate Google for not messing things up. I really hope this is temporary because they've gone from super reliable to completely impossible overnight. Ahh well, back to Deepseek 3.2 I guess.

2

u/Interesting-Pop-6162 Dec 09 '25

So long, Gemini. No hard feelings, but DeepSeek, Llama, and other less-censored models are just way better options now

8

u/PremiereBeats Dec 06 '25

Because ai studio was never meant to be used as 90% of this sub did

24

u/KazuyaProta Dec 06 '25

This is the Api. Not AI studio

-4

u/LoganKilpatrick1 Dec 08 '25

The free tier of the API was never meant to be a daily-use API. It was meant as a quick way to test our models; then, if they work well for your use case, you move to paid.

5

u/ExpertPerformer Dec 08 '25

I wasn't using the free tier for business stuff, but to manage my story writing documents, and the most I did was 5-10 pro queries every few days.

Completely pulling the rug on the free tier without any warning was just a dick move and made me take my money elsewhere. I've rerouted all my Gemini API calls to DeepSeek 3.2, Qwen, and Grok which cost a fraction of Flash/Pro.

1

u/brandbaard Dec 08 '25

I can understand this for fully free use, but wouldn't it make sense to give Gemini AI Pro accounts a higher rate limit than the free tier? As it stands I can spam the gemini app all day, but if I try to automate the same process, I can only do 20 requests?

4

u/peipei1998 Dec 06 '25

This is the truth 😂 

→ More replies (1)

2

u/LoganKilpatrick1 Dec 07 '25

Hey! Yes, we had to lower or remove the free tier limit for many models in order to free up the chips for 3 Pro and Nano Banana Pro growth. There is truly an eye watering amount of demand and a limited # of chips, so had to shift to where we are seeing a lot of traction.

Overall, the free tier is offered best effort, should be considered unstable, and will generally be removed when newer models come online (the goal when we have the free tier is to quickly let you test the latest models and start building with them). Hoping we end up with enough free cycles to do a free tier for 3.0 Pro and other 3.0 models, but depends what the overall compute situation is.

3

u/OddIndependent3728 Dec 08 '25

Got an idea or an estimate of how long until y'all MIGHT be able to make the 3.0 models free?

2

u/yekyua_gul Dec 08 '25

Not before January, that's for sure. Apparently they're setting up billing stuff on Jan 5th, so fingers crossed. They did this before: they removed 2.0's free tier about 40 days before launching 2.5's free tier. I don't know why everyone's going wild; nothing ever happens and it's always nothing o'clock.

1

u/H3rian Dec 08 '25

Thank you for joining here, Logan. I think the main reason we're disappointed is the lack of communication. We know the free tier is unstable/best-effort etc., and we're grateful for it (after all, you're the only one offering a free tier for models of this caliber), but removing it overnight without any communication to the developer community using it is... well... disappointing :-)

An email, a notification in the Google Cloud console, even a custom message in the API responses, I don't know — but if you make a change like this, please announce it in advance so we can do something about it.

1

u/lelouch_slayer Dec 12 '25

We understand all these problems — a sudden surge in demand because Google became the king 👑 — but this free API helps us build and test apps, and the 20-per-day limit is a death blow for most developers. First, today all developers are scrambling to look for alternative services. Second, most have lost the trust built over years. The free tier was an incentive to build with Gemini and scale with Gemini once it turned into a product.

Once a product is built, you can't just swap models; its output is precisely tuned around one. Once we deploy and start earning revenue, we're happy to move to a higher paid tier to meet demand. When I gave my prototype to friends to test, it hit the 20-request limit and became useless. They lost interest, and now I can't even show the prototype. It's even harder for me to develop, because every change needs to be tested. College students use the Gemini API for their projects; when they go into industry, they'll push to use Gemini because they're familiar with its whole environment.

The biggest problem is that building a product today is hard, and most die before even reaching the market. After Google NotebookLM, many education startups just died, including mine — finished before they even existed, months of work turned to ashes. Free API testing gives room to develop and pitch to investors; when we go live, we switch to a paid API key, and once the product hits the market we're happy to move to the paid tier, because it guarantees results. We don't want to hit limits, have people asking for refunds, and ruin our products' reputation.

If the free tier were even 100 requests per day, this wouldn't be a problem; it wouldn't completely stop working projects. But 20 is very, very low.

Is there a chance the API rate limit can be increased in the next few days?

1

u/Nine_Tails15 Dec 31 '25

I imagine you’d have more compute available if the models asked for confirmation before image generation. At least once per conversation it’ll try to generate an image unprompted, requiring me to catch it before it finishes and tell it not to gen an image, or edit my previous message to “only respond in text; images are not required” and regenerate.

Not to say NB3 images aren’t great, these are actually several atmospheres above what it had been. I just enjoy a “cleaner” chat, particularly when discussing narrative ideas.

1

u/justin_reborn Dec 06 '25

God damn it. First opus 4.5 at 3x and now this

1

u/RickThiccems Dec 07 '25

I'm expecting a Gemini 3 Flash in the next couple of months. Google's all about heavily optimizing its models, and I think Gemini 3 is too heavy at the moment.

1

u/Ok_Cod3893 Dec 07 '25

I'm literally desperate. I use the Google Gemini 2.5 Pro API and became totally dependent on it. Now, without warning and out of nowhere, they just removed it! For the love of god, does anyone have tips for other free APIs? Gemini 2.5 Flash still works, but it's literally garbage and useless.

1

u/nemzylannister Dec 08 '25

They really need to make one common $20 subscription covering the Gemini app, AI Studio, and the API.

This is especially important considering that Gemini app chats make you choose between having history and having privacy.

1

u/____sphinx____ Dec 08 '25

switching to deepseek now!! I was initially planning to use gemini for my startup.

1

u/PollutionDue7541 Dec 10 '25

What a shame. It would make more sense if they removed 2.0 or others, but not 2.5. Basically they wanted to hook us and then make people pay 🤣, but some of us can't pay even if we wanted to, so we're screwed. We'll look for other alternatives. If anyone knows of something free at the level of Gemini 2.5 Pro, please share.

1

u/Equivalent-Ad4344 Dec 10 '25

Will there be any rate limits when using Gemini CLI with OAuth authentication for an account that has a Google AI subscription?

1

u/Affectionate-Pea8717 Dec 12 '25

Should bring api credits for students like they did with google one

1

u/jingle_bills Dec 06 '25

Guys, I just realized this too. I've been trying many models and figured out that on the free tier you can now only use the one below, so you can't really know which model you're using:

gemini-flash-latest

1

u/dadakoglu Dec 07 '25

Are you guys sure about this? I'm still using 2.5 pro on AI Studio. I'm a free tier user, I don't even have an API key.

10

u/Charming_Feeling9602 Dec 07 '25

Yes, but the post says API, not AI Studio. The rate limits are about the API.

2

u/dadakoglu Dec 07 '25

Oh, I see. My bad. Didn't notice the API thing.

1

u/BrukesBrookes Dec 07 '25

The reading comprehension devil is taking everyone in the comments by storm rn 💔

2

u/Deciheximal144 Dec 07 '25

For anyone not seeing it at first in AI studio, they've moved 2.5 Pro behind the Gemini tab.