Major News from Microsoft’s Phi-3.5, Salesforce, Hugging Face, Hotshot, Luma AI, OpenAI, Condé Nast and ElevenLabs

Last updated: June 22, 2025

Microsoft Unveils Advanced Phi-3.5 AI Models, Surpassing Competitors

Microsoft has launched three new models in its Phi-3.5 series. They are Phi-3.5 Mini Instruct, optimized for reasoning in resource-limited environments; Phi-3.5 MoE, a mixture-of-experts model for complex tasks; and Phi-3.5 Vision Instruct, designed for multimodal processing of text and images. Each model demonstrates near-state-of-the-art performance, outperforming competitors like Google’s Gemini and OpenAI’s GPT-4o on several benchmarks. Released under the open-source MIT License, the models are intended for use in both commercial and research settings.

Salesforce Launches xGen-MM Open-Source Models to Enhance Multimodal AI

Salesforce has introduced xGen-MM, a suite of open-source multimodal AI models designed to advance visual language understanding. Also known as BLIP-3, these models can take in text and images together and generate content from them. The framework includes pre-trained models, datasets, and fine-tuning code, with the largest model featuring 4 billion parameters. A key innovation is the ability to process interleaved data, in which text and images are mixed in a single input, allowing for complex tasks like answering questions about multiple images. This open-source approach aims to democratize access to advanced AI tools, fostering innovation while raising important discussions about the ethical implications of powerful AI systems. The models are available on Salesforce’s GitHub, encouraging collaboration and transparency in AI research.

Hugging Face Empowers Developers with New Tutorial for Building AI-Powered Robots

Hugging Face has launched a comprehensive tutorial that enables developers to build and train their own AI-powered robots, a step forward for low-cost robotics. This initiative follows the introduction of the LeRobot platform and aims to democratize access to robotics, a field traditionally dominated by well-funded corporations. The tutorial provides detailed guidance on sourcing parts and deploying AI models, making robotics accessible to all skill levels. Central to the project is the Koch v1.1 robotic arm, designed for easy assembly. Emphasizing community collaboration, Hugging Face encourages users to share datasets that can be used to improve the AI models. This move not only fosters innovation but also raises important questions about the future of work and ethical considerations in automation. Hugging Face’s efforts mark a pivotal moment at the intersection of AI and robotics, setting the stage for transformative advancements in various industries.

Hotshot Unveils Innovative Text-to-Video AI Generator

Hotshot, a startup founded in 2023, has launched its self-titled text-to-video AI generator. This model allows users to create up to 10 seconds of footage at 720p and is currently available for free, albeit with a limit of two generations per day. Founded by Aakash Sastry, John Mullan, and Duncan Crawbuck, Hotshot previously focused on AI photo creation before pivoting to video. The model was trained over four months using extensive data and GPU resources. While initial results show promise, they may not yet match the quality of established competitors. Sastry anticipates that AI-generated content will soon become integral to digital media, enabling creators to produce entire videos autonomously.

Luma AI Unveils Dream Machine 1.5, Revolutionizing Text-to-Video Generation

Luma AI has launched Dream Machine 1.5, an upgraded text-to-video model that enhances realism and motion tracking while improving prompt understanding. This version allows for custom text rendering within videos, a significant advancement that opens new creative possibilities for dynamic graphics and title sequences. The model also supports non-English prompts, demonstrating its potential for multilingual content. With faster generation times and a focus on user feedback, Luma AI positions itself as a leader in the competitive AI video market. However, the rise of accessible AI video tools raises concerns about misuse, highlighting the need for ethical guidelines.

OpenAI Partners with Condé Nast, Transforming the Future of Publishing

OpenAI has forged a multi-year partnership with Condé Nast, the publisher of renowned titles like Vogue and The New Yorker, aiming to reshape media. This agreement allows OpenAI to access Condé Nast’s extensive content archive to enhance its AI systems, particularly ChatGPT, while providing the publisher with advanced technology tools for content creation and advertising. As tech companies increasingly collaborate with traditional media, this deal raises concerns about potential competition and the use of copyrighted material, especially in light of ongoing legal scrutiny. For Condé Nast, embracing AI signifies a strategic shift to thrive in the digital age, balancing innovation with the preservation of its editorial quality. The outcome of this partnership could offer insights into the evolving relationship between publishing and technology.

ElevenLabs Launches Global Text-to-Speech App Reader, Supporting 32 Languages

ElevenLabs has expanded its AI-powered text-to-speech app, Reader, to a global audience, now supporting 32 languages. Initially launched in the U.S., U.K., and Canada, the app allows users to upload text content such as articles and PDFs, which can be listened to in various languages and voices. The company, which recently became a unicorn after securing $80 million in funding, has enhanced its voice library by licensing the voices of iconic actors. The app utilizes ElevenLabs’ Turbo v2.5 model for improved quality and reduced latency. Future updates will introduce offline support and audio sharing capabilities, positioning Reader as a strong competitor in the text-to-speech market.

Frequently asked questions

What are the key features of Microsoft's new Phi-3.5 AI models?

Microsoft’s Phi-3.5 series includes three specialized models: Phi-3.5 Mini Instruct for efficient reasoning, Phi-3.5 MoE for complex tasks using a mixture of experts, and Phi-3.5 Vision Instruct for processing text and images. These models outperform competitors like Google’s Gemini and OpenAI’s GPT-4o on several benchmarks and are available under an MIT License for both commercial and research use.

How does Salesforce's xGen-MM advance multimodal AI capabilities?

Salesforce’s xGen-MM (BLIP-3) represents a significant advancement in multimodal AI by enabling seamless integration of text and image processing. The framework includes pre-trained models with up to 4 billion parameters, comprehensive datasets, and fine-tuning code. Its key innovation is the ability to process interleaved data, allowing for complex tasks like analyzing multiple images simultaneously.

What makes Hugging Face's new robotics tutorial significant?

Hugging Face’s robotics tutorial democratizes access to AI-powered robotics by providing step-by-step guidance for building and training robots. It focuses on the Koch v1.1 robotic arm and includes detailed instructions for sourcing parts and implementing AI models. This initiative makes advanced robotics accessible to developers of all skill levels, breaking down traditional barriers in the field.

What are the capabilities of Hotshot's new text-to-video AI generator?

Hotshot’s text-to-video AI generator can create up to 10 seconds of 720p video footage from text prompts. Currently offered for free with a limit of two generations per day, the model was trained over four months using extensive data. While showing promise, it’s still evolving to match the quality of established competitors in the market.

How does ElevenLabs Reader transform text-to-speech technology?

ElevenLabs Reader is a global text-to-speech app supporting 32 languages, allowing users to convert articles and PDFs into audio content. It uses the advanced Turbo v2.5 model for high-quality voice synthesis and features licensed voices from famous actors. The app offers reduced latency and plans to introduce offline support and audio sharing capabilities.

What impact will the OpenAI-Condé Nast partnership have on publishing?

The partnership allows OpenAI to enhance ChatGPT using Condé Nast’s content archive while providing the publisher with advanced AI tools for content creation and advertising. This collaboration could revolutionize digital publishing by combining traditional media expertise with cutting-edge AI technology, though it raises questions about copyright and competition.

How does Luma AI's Dream Machine 1.5 improve text-to-video generation?

Dream Machine 1.5 enhances text-to-video generation through improved realism, better motion tracking, and enhanced prompt understanding. It introduces custom text rendering within videos and supports non-English prompts. The upgrade also features faster generation times and incorporates user feedback, making it a significant advancement in AI video creation technology.

Gor Gasparyan

Optimizing creative and websites for growth-stage & enterprise brands through research-driven design, automation, and AI
