After announcing the Gemini family of models nearly two months back, Google has finally released its largest and most capable Ultra 1.0 model with Gemini, the new name for Bard.
Google says it's the next chapter of the Gemini era, but can it beat OpenAI's most-used GPT-4 model, which was released almost a year ago?
Today, we compare Gemini Ultra against GPT-4 and evaluate their commonsense reasoning, coding performance, multimodal capability, and more.
On that note, let's go through the comparison between Gemini Ultra and GPT-4.
1. The Apple Test
In our first logical reasoning test, popularly known as the Apple test, Gemini Ultra loses to GPT-4.
Google says that its far-superior Ultra model, accessible through the Gemini Advanced subscription, is capable of advanced reasoning.
However, on a simple commonsense reasoning question, Gemini Ultra falters.
Winner: GPT-4
2. Measure the Weight
In another reasoning test, Google Gemini again falls short of GPT-4, which is pretty disappointing, to say the least.
Gemini Ultra says 1,000 bricks weigh the same as 1,000 feathers, which is not true; a single brick weighs far more than a single feather, so the bricks are vastly heavier.
Another win for GPT-4!
3. End With a Specific Word
In our next test to compare Gemini and GPT-4, we asked both LLMs to generate 10 sentences that end with the word "Apple".
While GPT-4 produced eight such sentences out of 10, Gemini could only come up with three.
What a fail for Gemini Ultra!
Despite Google boasting that Gemini follows nuanced instructions exceptionally well, it fails to do so in practical use.
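For what it's worth, a constraint like this is easy to score mechanically. Here is a minimal sketch in Python; the sentences below are hypothetical stand-ins for model output:

```python
# Score how many generated sentences actually end with the target word.
# These sentences are hypothetical stand-ins for model responses.
sentences = [
    "For lunch I packed a crisp green apple.",
    "Nothing pairs with cheddar quite like an apple.",
    "The orchard sold cider, pie, and fresh bread.",  # fails the constraint
]

target = "apple"
# Strip trailing punctuation before checking the final word
passed = [s for s in sentences if s.rstrip(".!?\"'").lower().endswith(target)]
print(f"{len(passed)}/{len(sentences)} sentences end with '{target}'")
```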
4. Find the Pattern
We asked both frontier models from Google and OpenAI to understand a pattern and come up with the next result.
In this test, Gemini Ultra 1.0 identified the pattern correctly but failed to output the right answer.
GPT-4, on the other hand, understood it very well and gave the correct answer.
I feel Gemini Advanced, powered by the new Ultra 1.0 model, is still fairly dim and doesn't reason about the answer rigorously.
In comparison, GPT-4 may give you a dry reply, but it is mostly right.
Winner: GPT-4
5. Needle in a Haystack Challenge
The Needle in a Haystack challenge, developed by Greg Kamradt, has become a popular accuracy test for LLMs with large context lengths.
It lets you check whether the model can retrieve and recall a statement (the needle) from a large window of text.
I ran a sample text that takes up over 3K tokens and 14K characters and asked both models to find the answer within it.
Gemini Ultra couldn't process the text at all, but GPT-4 easily found the statement while also pointing out that the needle seemed unrelated to the overall narration.
Both have a context length of 32K, but Google's Ultra 1.0 model failed to do the task.
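To make the setup concrete, here is a minimal sketch of how such a test can be constructed, assuming a hypothetical ask_model() helper that wraps whichever LLM you are testing:

```python
# Bury a "needle" statement inside ~14K characters of filler text,
# then check whether the model can retrieve it.
FILLER = "The old clock on the mantel ticked through the quiet afternoon. "
NEEDLE = "The best thing to do in San Francisco is eat a sandwich in Dolores Park."

def build_haystack(target_chars: int = 14_000, depth: float = 0.5) -> str:
    """Repeat filler up to target_chars and insert the needle at the given depth."""
    text = FILLER * (target_chars // len(FILLER))
    cut = int(len(text) * depth)
    return text[:cut] + NEEDLE + " " + text[cut:]

prompt = build_haystack() + "\n\nWhat is the best thing to do in San Francisco?"
# answer = ask_model(prompt)  # hypothetical helper; pass if the answer mentions the needle
```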
6. Coding Test
In a coding test, I asked Gemini and GPT-4 to find a way to make a Gradio user interface public, and both gave the right solution.
Earlier, when I tested the same question on Bard powered by the PaLM 2 model, it gave an incorrect answer.
So yeah, Gemini has gotten much better at coding tasks.
Even the free version of Gemini, which is powered by the Pro model, gave the right answer.
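For reference, the fix both models pointed to boils down to Gradio's built-in share flag; a minimal sketch:

```python
import gradio as gr

def greet(name: str) -> str:
    return f"Hello, {name}!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
# share=True asks Gradio to tunnel the app to a temporary public
# *.gradio.live URL instead of serving only on localhost
demo.launch(share=True)
```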
Winner: Tie
7. Solve a Math Problem
Next, I gave a fun math problem to both LLMs, and both excelled at it.
For parity, I asked GPT-4 not to use Code Interpreter for mathematical calculations, since Gemini does not come with a similar tool yet.
8. Creative Writing
Creative writing is where Gemini Ultra is noticeably better than GPT-4.
I have been testing the Ultra model on creative tasks over the weekend, and it has so far done a remarkable job.
GPT-4's responses seem a bit dry and more robotic in tone and tenor.
Ethan Mollick also shared similar observations while comparing both models.
So if you are looking for an AI model that is good at creative writing, I think Gemini Ultra is a solid pick.
Add the latest knowledge from Google Search, and Gemini becomes a remarkable tool for researching and writing on any topic.
Winner: Gemini Ultra
9. Generate Images
Both models support image generation, via DALL-E 3 and Imagen 2 respectively, and OpenAI's image generation capability is indeed better than Google's text-to-image model.
However, when it comes to following instructions while generating images, DALL-E 3 (integrated with GPT-4 in ChatGPT Plus) fails the test and hallucinates.
In contrast, Imagen 2 (integrated with Gemini Advanced) accurately follows the instructions, showing no hallucination.
In this regard, Gemini beats GPT-4.
10. Guess the Movie
When Google announced the Gemini models two months back, it demonstrated several cool demos.
The video showed off Gemini's multimodal capability, where it could look at multiple images and infer the deeper meaning connecting them.
However, when I uploaded one of the images from the video, it failed to guess the movie.
In comparison, GPT-4 guessed the movie in one go.
On X (formerly Twitter), a Google employee has confirmed that the multimodal capability has not been turned on for Gemini Advanced (powered by the Ultra model) or Gemini (powered by the Pro model).
Image queries don't go through the multimodal model yet.
That explains why Gemini Advanced didn't do well in this test.
So for a true multimodal comparison between Gemini Advanced and GPT-4, we must wait until Google adds the feature.
The Verdict: Gemini Ultra vs GPT-4
When we talk about LLMs, excelling at commonsense reasoning is what makes an AI model smart or dumb.
Google says Gemini is good at complex reasoning, but in our tests, we found that Gemini Ultra 1.0 is still nowhere close to GPT-4, at least when dealing with logical reasoning.
There is no spark of intelligence in the Gemini Ultra model, at least we didn't observe one.
GPT-4 has that "stroke of genius" quality, a secret sauce that puts it above every AI model out there.
Even an open-source model such as Mixtral-8x7B does better at reasoning than Google's supposedly state-of-the-art Ultra 1.0 model.
Google heavily marketed Gemini's MMLU score of 90%, outranking even GPT-4 (86.4%), but on the HellaSwag benchmark that tests commonsense reasoning, it scores 87.8% whereas GPT-4 achieves a high score of 95.3%.
How Google managed to get a 90% score on the MMLU test with CoT@32 prompting is a story for another day.
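For context, CoT@32 samples 32 chain-of-thought responses per question and aggregates their final answers, while GPT-4's 86.4% was reported with plain 5-shot prompting, so the two numbers aren't measured the same way. Below is a simplified majority-vote sketch of that sampling scheme (not Google's exact uncertainty-routed variant), assuming a hypothetical generate() helper:

```python
from collections import Counter

def cot_at_k(question: str, k: int = 32) -> str:
    """Sample k chain-of-thought answers and return the majority-vote result."""
    finals = []
    for _ in range(k):
        # generate() is a hypothetical helper that calls an LLM with a
        # chain-of-thought prompt at temperature > 0, so samples differ
        _reasoning, final_answer = generate(question)
        finals.append(final_answer)
    # Most common final answer wins the vote
    return Counter(finals).most_common(1)[0][0]
```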
As far as Gemini Ultra's multimodal capabilities are concerned, we can't pass judgment right now since the feature has not been added to the Gemini models yet.
However, we can say that Gemini Advanced is pretty good at creative writing, and coding performance has improved since the PaLM 2 days.
To sum up, GPT-4 is overall a smarter and more capable model than Gemini Ultra, and to change that, the Google DeepMind team will have to find that secret sauce.