ChatGPT vs Gemini
Today I decided to see what OpenAI does with my ongoing LLM/AI harm research, so I signed up for the Pro plan, as you do. Right out of the gate, as I get my feet wet with ChatGPT, I see there is a “project” feature. I’m going to be honest: I like it. Google should ape that. One downside that I’ve noticed immediately, however, is that I cannot initiate a deep research query within a project. What the heck?
Ethics
Another thing I’ve noticed at the conclusion of my first ChatGPT deep research query is that it may be able to pull results from more sources than Gemini. Originally I suspected this was because OpenAI may lack ethics, but it may actually be because of the deals they have with various publications. I’ll need to do some research on this for sure!
Generally it would appear that Google is a little more ethical in its data sourcing than OpenAI, but perhaps not as much as I initially suspected. As far as web crawling is concerned, Google is the far more mature company: its crawler doesn’t DDOS websites, whereas OpenAI’s reportedly does. It’s currently a pretty huge issue.
Functionality Differences
In addition to the project feature on ChatGPT, there are some other small differences I’m noticing right away as well. One thing I was interested in trying on ChatGPT was Operator. This is a feature that will take a request, such as “find flights from a to b for a 5-day vacation. I would like the lowest priced flight and favor TWA because I have points,” and it’ll operate a web browser to perform that task. It is quite impressive, but I wasn’t sure if I could use it from my phone, so I asked ChatGPT’s live chat feature. It said I could, but I could not. GPT-4.5 gave a correct response, as did Gemini 2.0 Flash and 2.5 Pro. Obviously one event doesn’t constitute a trend, but I am surprised that GPT-4o would give me the wrong answer about its own feature set. I’ll have to monitor accuracy going forward for sure.
Another bonus for Gemini is the Google Doc export capability for deep research queries. I can send my deep research output off to Google Docs and do whatever I want to it. Also, individual tables in the output can be exported to Sheets. I don’t know that I’ll ever use it, but certainly it’s a consideration.
A difference between them that is more of a personal preference is how they handle follow-up questions to deep research queries. On the ChatGPT side you can choose to either use deep research again or just ask regular questions. On the Gemini side, these days it analyzes your follow-up and decides on its own whether it can be answered as a regular query or should kick off another deep research session. So, for this feature, ChatGPT gives you the control where Gemini tries to figure it out itself.
As far as deep research model availability is concerned, it seems that all but one of Google’s models can be used: Gemini 2.5 Flash cannot do deep research. Gemini 2.5 Pro has been really good in my experience, however, so I’m not sure why you’d want the downgrade right now. Perhaps there are limits that aren’t exposed to me that we may see in the future; after all, 2.5 Pro costs $10-$15/Mtok output. OpenAI limits me to 250 deep research queries per month on the Pro plan, and GPT-4.5 costs $150/Mtok out. My monthly cost for each (if I didn’t have a free year from Google) is $20/mo for Google and $200/mo for ChatGPT, so perhaps we’ll see a similar deep research limit from Google at some point, since the relative costs are the same per query. It would be nice if OpenAI would apply the deep research limit based on total cost, in my opinion – let me do 3,750 deep research queries against GPT-4o, for example.
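The 3,750 figure falls out of a simple cost-normalized calculation. Here is a minimal sketch using the prices quoted above, and assuming GPT-4o output costs roughly $10/Mtok (that price is my assumption, not something stated here):

```python
# Cost-normalized deep research limits, using the numbers quoted above.
# Assumption: GPT-4o output costs about $10/Mtok (not stated in this post).
GPT45_PRICE = 150.0        # $/Mtok output, GPT-4.5
GPT4O_PRICE = 10.0         # $/Mtok output, GPT-4o (assumed)
MONTHLY_QUERY_LIMIT = 250  # Pro plan deep research queries per month

# If the cap were based on total spend rather than query count, the same
# budget would buy proportionally more queries on the cheaper model.
budget_units = MONTHLY_QUERY_LIMIT * GPT45_PRICE   # 37,500 price units
gpt4o_queries = budget_units / GPT4O_PRICE

print(int(gpt4o_queries))  # 3750
```

Of course this assumes each deep research query consumes a similar number of output tokens regardless of model, which may not hold in practice.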
Another issue I’m seeing relates to table layouts in deep research output. Right now I have my monitor split 50/50, with my browser on one side and this blog on the other. The ChatGPT output has tables that are too wide to fit in this arrangement, and I cannot read them completely, whereas Gemini does not have this issue. If I highlight a table in my web browser it does scroll, but no scrollbar is available. In the app I can scroll left and right without issue.
Between the two, I will also say that Gemini’s UI appears buggier than ChatGPT’s. A couple of days ago it was wigging out because of an errant HTML tag in its output. There was also a period where Google listed each model twice in the model selection dropdown. I recall having some issues with the UI in the past as well, but I don’t remember specifics – just that it’s been buggy generally.
The last thing I’ll mention is that, at least for the deep research I’ve done thus far, Gemini has a more academic voice whereas ChatGPT is a little more informal. I don’t know if this is trained into the model or baked into the system instructions, but there is a little voicing difference between the two in this regard.
Conclusion thus far
As a quick vibes-based conclusion thus far, I would say I prefer Gemini over ChatGPT. While it may be easy to argue that both are ethically dubious, I would argue that Gemini is more ethical than ChatGPT. Both are likely trained on the “whole internet,” but Google appears to obey requests in robots.txt, where I’m not convinced OpenAI does. Further, if your robots.txt forbids Google-Extended access, Gemini won’t look at your pages at all – even if the Gemini user requests it. Finally, it appears that GoogleBot is better behaved than GPTBot: there are reports of DDOS effects from GPTBot, while Google-Extended seems to cause less havoc.
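For reference, the opt-outs mentioned above are expressed as user-agent tokens in robots.txt: Google-Extended is Google’s token for AI use of crawled content, and GPTBot is OpenAI’s crawler. A minimal robots.txt that blocks both would look like:

```
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /
```

Note that blocking Google-Extended does not affect regular Googlebot search indexing; it only governs AI use of the content.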
From an output perspective, I like Gemini’s output more than ChatGPT’s. It seems more accurate, though my testing has just begun. Sourcing of information also seems more transparent with Gemini. OpenAI has some additional features, though, and they are impressive: Operator and ChatGPT plugins can do a lot for you, where Gemini is more bare-bones.
Potential Upcoming Post Spoiler
Regarding ethics, in my current vein of research, Gemini gave me an alert on my deep research queries saying I should seek professional help, whereas ChatGPT did not. Now it’s possible that ChatGPT correctly understood the prompt and knew I wasn’t interested in self-harm, but it’s also possible that it just skipped over that or does not have that feature. I’ll have to explore that further, but it was an interesting observation, in my opinion.