Welcome back to another edition of Beyond Tomorrow.
Every week, companies and researchers are pushing the envelope of artificial intelligence a little more than before. This week, Big Tech had some big releases, while over in the east, some scientists have made a breakthrough that has the potential to revolutionise how we do research. Let's have a look, shall we?
Google has only been increasing its investments in AI. The release of the Pixel 9 line of smartphones at the Made By Google event this week shows just how serious they are about it. The foursome of their latest Pixel offering— the Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL and the Pixel 9 Pro Fold—will now come with Gemini as the default assistant, replacing Google Assistant.
Elon Musk's xAI might be shaping up to be a serious contender in the world of AI startups and large language models. Guess xAI is turning out to be a better investment than X.
xAI Launches Beta Version of Grok-2
Earlier this week, xAI released the beta version of the Grok-2 family of large language models. The family includes two models, Grok-2 and Grok-2 mini, both of which will be accessible to X Premium and Premium+ members.
Compared to Grok-1.5, the new model is a "significant step forward," according to the company's blog post. At the time of the blog's writing, the company said that Grok-2 was outperforming both Anthropic's Claude 3.5 Sonnet and OpenAI’s ChatGPT-4 Turbo.
Shortly after Grok-2 was released, users were quick to point out that the model's guardrails against copyrighted images and controversial content were severely lacking. Not only was it able to bypass copyrights, but it also had no hesitation in showing objectionable content, like children as victims of school shootings.
However, at the time of this writing, the loopholes regarding showcasing controversial and gory content have been fixed. Grok-2 now typically responds by refusing to showcase such content, citing it as 'sensitive,' and saying that such requests could be seen as "promoting or depicting violence or sensitive political acts". It's definitely a step in the right direction. However, the issue regarding copyrighted imagery persists.
The image generation part of Grok-2 is actually powered by another company called Black Forest Labs and uses their FLUX.1 model for the tasks.
The Grok-2 family will become available to developers through xAI's new enterprise API platform later this month. xAI says that the upcoming API is built on a "new bespoke tech stack that allows multi-region inference deployments for low-latency access across the world."
It looks like self-declared "free-speech absolutist" Elon Musk might finally have created something worthwhile with his greatest superpower: money.
Google Launches AI-Ready Pixel 9 Smartphones
While OpenAI had been talking up its Advanced Voice Mode over the past couple of months, Google was quietly working in the background. We got to see the results this week with the release of the Pixel 9 series of phones, which are all AI-enabled with Gemini Live.
Google positions Gemini Live as a "mobile conversational experience" that lets people have "free-flowing conversations" with the LLM. What's even better is the fact that Google just up and released it to users with the announcement. Meanwhile, OpenAI's Advanced Voice Mode is still languishing in Alpha and only available to select users.
Still, Google's track record with their AI products hasn't exactly been the most stellar or stable. For those keeping count back at home, it's 0-3. Gemini's predecessor, Bard, was prone to providing incorrect or false information. Then, there was Gemini creating black Nazis. And most recently, AI Overviews, which was prone to suggesting people put rocks in their spaghetti.
But Gemini Live seems like it has the potential to change how people view Google's AI products. It currently comes with a range of 10 voices for users to pick from, versus OpenAI's three. Google claims that you can interrupt Gemini Live, similar to how you'd have a conversation with someone.
Additionally, users will be able to integrate Gemini Live across a suite of Google products to make day-to-day tasks easier, but the extensions will be rolled out over the coming weeks.
Gemini Live has already started rolling out to Gemini Advanced subscribers at the time of this article.
Other features coming with the Pixel 9 include the Add Me feature, Pixel Studio, and a new feature called Call Notes. Here's a quick breakdown of what each of them does:
Add Me: Using augmented reality and AI, users will now be able to add people missing in a photograph to the frame. Which is pretty cool, until you start thinking about all the ways people are likely to misuse it.
Pixel Studio: Users will have access to Google DeepMind’s text-to-image model Imagen 3 to create images that they'd like to create. The technology itself will be on the cloud and not on the phone itself.
Call Notes: This one is controversial, to say the least. Google will basically save a "private summary" of your conversation with a person after a phone call. It's surprising what all of us are willing to give up in the name of convenience. The company says that the feature runs fully on-device and the other person on the call will also be notified. Take that as you will from the company that quietly changed its motto from "Don’t be evil" to "Do the right thing."
What Else We Covered This Week:
And … that’s it for this week! If you liked what you read, share it with folks you know. If you didn't, send us some feedback. Until next time!
Also Read: Ola’s Krutrim To Launch AI Chips by 2026
Beyond Tomorrow is a weekly newsletter published every Saturday to give you a roundup of everything AI in the last week.