
Latest AI News

Deep-learning-based pan-phenomic data reveals the explosive evolution of avian visual disparity

A recent study harnesses deep learning, training a ResNet34 model to recognize over 10,000 bird species, to analyze avian morphological evolution. The model's high-dimensional embedding space captures phenotypic convergence, and the morphological disparity it measures is linked to species richness, underscoring richness as a key driver of morphospace expansion. Disparity patterns following the K-Pg mass extinction show an "early burst" of diversification. Notably, the model forms hierarchical structure despite being trained on flat labels, challenging the assumption that CNNs rely mainly on local textures.

arXiv
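The disparity measurement underlying the study can be illustrated with a common metric, total variance in embedding space. The paper's exact metric isn't specified in this summary, so `disparity` and the toy clade below are purely illustrative:

```python
import numpy as np

def disparity(embeddings):
    """Morphological disparity as total variance: mean squared distance
    of each species' embedding to the clade centroid."""
    centroid = embeddings.mean(axis=0)
    return float(np.mean(np.sum((embeddings - centroid) ** 2, axis=1)))

# toy clade: two species at distance 2 in a 2-D embedding space
clade = np.array([[0.0, 0.0], [2.0, 0.0]])
d = disparity(clade)
```

Identical embeddings give zero disparity; spread in the embedding space increases it, which is how morphospace expansion would register.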
SymPlex: A Structure-Aware Transformer for Symbolic PDE Solving

SymPlex introduces a novel reinforcement learning framework for deriving analytical solutions to partial differential equations (PDEs) without needing ground-truth data. It employs a structure-aware Transformer, SymFormer, to optimize solutions based solely on the PDE and its boundary conditions. This approach enables interpretable solutions that effectively handle non-smooth behaviors, offering a significant advancement over traditional numerical methods. Empirical tests show SymPlex accurately recovers complex PDE solutions, highlighting its potential for practical applications in mathematical modeling and engineering.

arXiv
Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning

Researchers have developed DualSpeed, a framework to enhance the training efficiency of Multimodal Large Language Models (MLLMs) by addressing the inefficiencies related to massive model sizes and visual tokens. DualSpeed uses a dual-mode approach: a fast mode that employs Visual Token Pruning (VTP) to reduce visual tokens, and a slow mode that trains on full sequences for consistency. This method significantly accelerates training—2.1x for LLaVA-1.5 and 4.0x for LLaVA-NeXT—while maintaining over 99% performance. Code is available on GitHub.

arXiv
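DualSpeed's fast mode hinges on dropping low-importance visual tokens. The paper's scoring rule isn't given in this summary; the sketch below assumes precomputed importance scores (e.g., the attention each visual token receives) and keeps a top-k subset while preserving spatial order:

```python
import numpy as np

def prune_visual_tokens(tokens, scores, keep_ratio=0.25):
    """Keep the top-k visual tokens by importance score (fast mode).

    tokens: (n, d) array of visual token embeddings
    scores: (n,) importance scores for each token
    """
    n = tokens.shape[0]
    k = max(1, int(n * keep_ratio))
    keep = np.argsort(scores)[::-1][:k]  # indices of the k highest-scoring tokens
    keep = np.sort(keep)                 # restore spatial order of kept tokens
    return tokens[keep], keep

# toy example: 8 tokens, 4-dim embeddings; higher score = more salient
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
scores = np.array([0.1, 0.9, 0.2, 0.8, 0.05, 0.3, 0.7, 0.15])
pruned, kept = prune_visual_tokens(tokens, scores, keep_ratio=0.5)
```

Fewer visual tokens shrink the sequence the LLM must process during training, which is where the reported speedups come from.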
Dassault Systèmes and NVIDIA Partner to Build Industrial AI Platform Powering Virtual Twins

A new shared industrial AI architecture integrates Dassault Systèmes' science-validated Virtual Twins with NVIDIA's scalable AI infrastructure, enhancing real-time decision-making in manufacturing. The platform lets industries apply AI to process optimization, predictive maintenance, and operational efficiency, positioning industrial AI as a crucial tool in modern production environments.

Nvidia.com
Darren Aronofsky, Your AI Slop Is Ruining American History in 'On This Day…1776'

The new short film series "On This Day…1776" opens with a poignant visual of a hand brushing over the title page of Thomas Paine's "Common Sense," highlighting its historical significance. This series aims to explore pivotal events of the American Revolution, providing context and insight into the period's influential figures and ideas.

CNET
Why more consumers prefer AI-enhanced shopping - and still expect the human touch

A recent ZDNET report reveals that 73% of consumers are utilizing AI chatbots for product searches, reflecting a growing trend in e-commerce. The article emphasizes the need for businesses to integrate AI tools to enhance customer engagement and streamline the shopping experience, as consumer reliance on these technologies rises.

ZDNet
Elon Musk's SpaceX officially acquires Elon Musk's xAI, with plan to build data centers in space | TechCrunch

SpaceX announced its acquisition of Elon Musk's AI startup, xAI, marking a significant expansion into artificial intelligence. This merger positions SpaceX as the world’s most valuable private company. The integration aims to leverage xAI's technology to enhance SpaceX's operations and decision-making processes, potentially streamlining its ambitious projects in space exploration.

TechCrunch
HHS Is Using AI Tools From Palantir to Target ‘DEI’ and ‘Gender Ideology’ in Grants

The Department of Health and Human Services has been utilizing AI tools from Palantir since March to enhance the screening and auditing processes for grants and job descriptions. This initiative aims to ensure compliance with federal regulations and improve oversight. The integration of these AI tools is expected to streamline operations and reduce errors in grant management.

Wired
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

PixelGen is a novel pixel diffusion framework that bypasses the limitations of traditional two-stage latent diffusion models by optimizing directly in pixel space. It employs two perceptual losses, LPIPS for local patterns and DINO for global semantics, to enhance image quality. PixelGen achieves a competitive FID of 5.11 on ImageNet-256 with just 80 training epochs and shows strong performance in large-scale text-to-image tasks, evidenced by a GenEval score of 0.79. This approach eliminates the need for VAEs and auxiliary stages, offering a streamlined and effective generative model. Full code is available on GitHub.

arXiv
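The two-term objective described above can be sketched as a weighted sum of feature-space distances. Real LPIPS and DINO networks are stood in for here by toy feature extractors, so the helper names and weights are illustrative only:

```python
import numpy as np

def combined_perceptual_loss(pred, target, local_fn, global_fn,
                             w_local=1.0, w_global=0.5):
    """Weighted sum of a local-pattern loss (LPIPS-style) and a
    global-semantic loss (DINO-style), each the mean squared distance
    between feature maps of pred and target."""
    l_local = np.mean((local_fn(pred) - local_fn(target)) ** 2)
    l_global = np.mean((global_fn(pred) - global_fn(target)) ** 2)
    return w_local * l_local + w_global * l_global

# stand-in feature extractors (the real model would use LPIPS and DINO nets)
local_fn = lambda x: x                       # identity: raw pixels as "local" features
global_fn = lambda x: x.mean(axis=(0, 1))    # per-channel means as "global" features

img = np.zeros((4, 4, 3))
same = combined_perceptual_loss(img, img, local_fn, global_fn)
diff = combined_perceptual_loss(img, img + 1.0, local_fn, global_fn)
```

Because both terms act directly on the generated image, no VAE encoder/decoder stage is needed between the model and the loss.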
Expanding the Capabilities of Reinforcement Learning via Text Feedback

A recent study introduces RL from Text Feedback (RLTF), leveraging text critiques to enhance large language models during post-training. Unlike traditional methods, RLTF uses multi-turn reinforcement learning, allowing models to internalize feedback without requiring extensive demonstrations. Two techniques, Self Distillation and Feedback Modeling, were tested on various tasks and consistently outperformed existing baselines, indicating that text feedback can significantly and efficiently improve model performance.

arXiv
Multi-head automated segmentation by incorporating detection head into the contextual layer neural network

A new gated multi-head Transformer architecture, based on Swin U-Net, improves auto-segmentation in radiotherapy by integrating inter-slice context and a parallel detection head. This model effectively reduces false positives, achieving a mean Dice loss of 0.013 ± 0.036, compared with 0.732 ± 0.314 for traditional methods. This advancement enhances the reliability of automated contouring in clinical settings.

arXiv
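For reference, the Dice loss quoted above is, in its standard soft form (the paper's exact variant isn't specified in this summary):

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2*|P∩T| / (|P| + |T|).
    pred: predicted probabilities in [0, 1]; target: binary mask."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

mask = np.array([[1, 1], [0, 0]], dtype=float)
perfect = soft_dice_loss(mask, mask)        # identical masks
miss = soft_dice_loss(1.0 - mask, mask)     # fully disjoint masks
```

A loss near 0 indicates near-perfect overlap with the reference contour; a loss near 1, almost none.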
Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel

A recent study highlights the challenges of Expert Parallel (EP) communication when training hyperscale mixture-of-experts (MoE) models. EP requires all-to-all communication whose volume is hard to plan for, since token-to-expert routing is both dynamic and sparse. The findings suggest that improving EP communication efficiency is crucial for optimizing MoE performance, which could significantly reduce training times and improve resource utilization in large-scale machine learning environments.

Nvidia.com
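Why EP's all-to-all is hard to optimize can be seen in a toy dispatch: how many tokens each rank must send to every other rank depends on the router's per-batch decisions. The helper below is illustrative, not the paper's implementation:

```python
import numpy as np

def ep_dispatch(token_ids, expert_of, n_ranks, experts_per_rank):
    """Build per-destination-rank send buffers for EP's all-to-all:
    each token is sent to the rank hosting its routed expert.
    Buffer sizes vary with the router's decisions each batch."""
    dest = expert_of // experts_per_rank          # rank hosting each token's expert
    return [token_ids[dest == r] for r in range(n_ranks)]

# 6 tokens routed among 4 experts on 2 ranks (experts 0-1 on rank 0, 2-3 on rank 1)
tokens = np.arange(6)
routed = np.array([0, 3, 1, 2, 3, 0])
buffers = ep_dispatch(tokens, routed, n_ranks=2, experts_per_rank=2)
```

Because `routed` changes with every batch, buffer sizes are unknown until runtime, which is what makes EP traffic dynamic and sparse.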