Google DeepMind’s AI Fact-Checker Outperforms Humans
Google’s DeepMind research unit has unveiled SAFE, the Search-Augmented Factuality Evaluator, an AI system that its researchers say can evaluate the accuracy of information better than human annotators. SAFE breaks generated text down into individual facts and uses Google Search results to determine the accuracy of each claim. In a head-to-head comparison with human annotators, SAFE’s assessments matched human ratings 72% of the time. And in cases where SAFE and humans disagreed, SAFE was found to be correct 76% of the time. The researchers describe this as “superhuman” performance, and while some experts question what that term really means in this context, SAFE has clear potential to change how fact-checking is done. One clear advantage? Cost. Using SAFE is about 20 times cheaper than using human fact-checkers. As the volume of information generated by language models continues to explode, having an economical and scalable way to verify claims will be increasingly vital.
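For readers who want a feel for the split-then-verify idea described above, here is a minimal Python sketch. It is not DeepMind’s implementation: the function names (`split_into_facts`, `rate_fact`, `evaluate`, `toy_search`) are hypothetical, sentence splitting stands in for the fact extraction step, and a tiny in-memory corpus stands in for Google Search.

```python
from typing import Callable, Dict, List


def split_into_facts(text: str) -> List[str]:
    """Naive stand-in for fact extraction: treat each sentence as one fact."""
    return [s.strip() for s in text.split(".") if s.strip()]


def rate_fact(fact: str, search: Callable[[str], List[str]]) -> str:
    """Toy rule (an assumption): a fact is supported if a search snippet contains it."""
    snippets = search(fact)
    hit = any(fact.lower() in snippet.lower() for snippet in snippets)
    return "supported" if hit else "not supported"


def evaluate(text: str, search: Callable[[str], List[str]]) -> Dict[str, str]:
    """Rate every extracted fact independently and return a fact -> label map."""
    return {fact: rate_fact(fact, search) for fact in split_into_facts(text)}


def toy_search(query: str) -> List[str]:
    """Hypothetical search backend: returns corpus entries sharing a word with the query."""
    corpus = [
        "Paris is the capital of France",
        "Water boils at 100 degrees Celsius at sea level",
    ]
    words = set(query.lower().split())
    return [doc for doc in corpus if words & set(doc.lower().split())]


ratings = evaluate(
    "Paris is the capital of France. The Moon is made of cheese.",
    toy_search,
)
for fact, label in ratings.items():
    print(f"{label:>13}: {fact}")
```

In this toy run the first sentence is labeled supported and the second is not. A real system would need far more robust fact extraction and evidence matching than simple substring checks.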
Google Gemini AI Coming to Android Tablets, Coexisting with Google Assistant (For Now)
Google’s new generative AI model, Gemini, is making its way to more of your devices, and it looks like it might be able to coexist with Google Assistant, at least for now. Currently available on Android phones, Gemini is expected to eventually replace Google Assistant, the virtual assistant used for voice commands. On phones, users who install Gemini have to choose between it and Google Assistant. But a recent discovery in the Google Search app’s code suggests that things might be different for tablets. The code refers to using Gemini on a “tablet,” along with several features, and it appears that the Google app itself will host Gemini on tablets, rather than a standalone app as on phones. As this is still a beta version of the Google Search app, Google could always change its mind and not roll out these features. But for now, it looks like Android tablet users might get to enjoy the best of both worlds with Gemini and Google Assistant.
Elon Musk Unveils Grok-1.5: Closing In on GPT-4 Performance
Elon Musk’s xAI has just announced Grok-1.5, a major upgrade to its large language model, mere weeks after open-sourcing Grok-1. Set for release next week, Grok-1.5 brings enhanced reasoning and problem-solving capabilities, closing in on the performance of leading models like OpenAI’s GPT-4 and Anthropic’s Claude 3. While it still falls slightly behind GPT-4 and Claude 3 on the MMLU benchmark, xAI expects to continue these improvements with Grok-2, which Musk says should exceed current AI models on all metrics. Grok-1.5 will initially be available to early testers and those using the Grok chatbot on the X platform, with a phased rollout to a wider set of users over time. Musk has also announced that X accounts with a certain number of verified subscriber followers will get Premium and Premium+ subscription benefits, including Grok, for free.
Amazon Invests Record $2.75B in Anthropic, Doubling Down on AI
Amazon has just announced a massive $2.75 billion investment in Anthropic, the company behind the Claude 3 family of large language models. This brings Amazon’s total investment in the OpenAI rival to $4 billion, the largest venture investment in the e-commerce and cloud computing giant’s history. Anthropic has been making waves lately with the release of Claude 3, which the company says outperforms OpenAI’s GPT-4 on a number of benchmarks. This investment is a clear sign that Amazon sees a major upside in convincing customers to use its cloud services, build AI apps with its Bedrock platform, and do so using cutting-edge models like Claude 3. As the AI race heats up, this record-breaking investment in Anthropic could help cement Amazon’s position as a major player in artificial intelligence.