Claude Can Now Analyze Visuals in PDFs

Plus: Runway Brings 3D Controls to AI Video Generation

Welcome to Get The Gist, where every weekday we share an easy-to-read summary of the latest and greatest developments in AI—news, innovations, and trends—all delivered in under 5 minutes! ⏱

In today’s edition:

  • Anthropic’s Claude Can Now Analyze Visuals in PDFs

  • Runway Brings 3D Controls to AI Video Generation

  • NVIDIA is Advancing “Physical AI” for Healthcare

  • Microsoft Is Rebranding Its AI Services

  • And more AI news….

Top Developments

Anthropic

Image by: Anthropic

The Gist: Anthropic's Claude 3.5 Sonnet AI model now offers Visual PDF analysis, allowing users to extract information from images, charts, and graphs in documents under 100 pages, ideal for technical documents and research papers.

Key Details:

  • Visual PDFs, available in Feature Preview, let Claude analyze visual elements in PDFs, not just text.

  • Document size limits have increased to 30MB each, with a maximum of five images or documents per upload.

  • Users can enable Visual PDF functionality for API requests, enhancing flexibility in data handling.

  • Additional recent updates include Claude’s Analysis Tool for JavaScript execution, LaTeX rendering for equations, and integration with GitHub Copilot.

Runway

Image by: Runway

The Gist: Runway’s latest update to its Gen-3 Alpha Turbo model allows users to add cinematic 3D effects and precise camera movements to AI-generated videos, giving creators new tools for immersive storytelling.

Key Details:

  • The new camera controls enable smooth zooms, panning, and trucking, enhancing the 3D realism of AI-generated scenes.

  • Creators can adjust movement speed and direction to add depth and build suspense or focus on details.

  • Improved control over character and background positioning helps avoid visual glitches common in previous models.

  • Runway’s tools are aimed at professional and independent filmmakers, with industry partnerships, including with Lionsgate, supporting cinematic-quality productions.

Nvidia

Image by: Nvidia

The Gist: NVIDIA is advancing “physical AI” for healthcare, aiming to transform hospitals with AI-driven robotics that assist in medical procedures, patient monitoring, and hospital logistics.

Key Details:

  • NVIDIA’s “physical AI” initiative envisions AI-empowered hospitals where robots can interact with the physical environment to support medical care.

  • Collaborating with Mark III, NVIDIA is developing digital twins of hospital settings to train AI and clinicians, allowing surgical practice in virtual environments.

  • NVIDIA has invested in robotics company Moon Surgical and medical scribe company Abridge, emphasizing partnerships over direct healthcare industry entry.

  • Future applications could extend to robots performing non-surgical tasks, such as patient monitoring and logistics, within healthcare settings.

Quick Gist

Microsoft is rebranding its AI services under "Windows Intelligence," which may eventually replace the Copilot branding while aiming to integrate AI more deeply across its products (Read More).

Abu Dhabi National Oil will implement agentic AI developed by Microsoft and G42 to enhance operational efficiency in the energy sector, promising faster seismic surveys and improved production forecasts (Read More).

Google's Big Sleep team has successfully utilized a large-language model to discover a zero-day vulnerability in SQLite, marking a significant advancement in AI-assisted security research (Read More).

OpenAI has hired former Pebble CEO Gabor Cselle to work on a "secret project," as anticipation grows for the upcoming release of its powerful new AI model, Orion (Read More).

OpenAI CEO Sam Altman announced that while GPT-5 will not be released this year, the company is focusing on GPT-o1, a new model designed for specialized problem-solving, and hinted at future capabilities for independent task execution (Read More).

Google is rolling out Gemini extensions in the Google Home app, enabling users to control smart home devices using natural language, gradually replacing Google Assistant for these functions on Android (Read More).

Google updated its Gemini 1.5 Pro model with improvements in data quality and performance for Gemini Advanced subscribers, enhancing accuracy in various tasks since its last update in October (Read More).

Meta launched Purple Llama, an open-source initiative to create tools for developers to enhance the trustworthiness and safety of generative AI models in collaboration with various tech partners (Read More).

THAT’S IT FOR TODAY!

That’s it for today, see you tomorrow! 👋

If you have any questions, feedback, or requests, drop us an email at [email protected]. We love hearing from our readers! 😊

P.S. If this email was forwarded to you, you can sign up for free by clicking here!