Top News

Introducing the next generation of Claude

Anthropic introduces the Claude 3 model family, setting new benchmarks in AI capabilities with three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, which trade off speed, intelligence, and cost. Opus and Sonnet are now accessible via claude.ai and the Claude API in 159 countries, with Haiku launching soon. Opus excels in comprehension and fluency, showing near-human performance on evaluations such as undergraduate-level knowledge, graduate-level reasoning, and basic mathematics. All models improve upon previous versions in analysis, content creation, code generation, and multilingual conversation. They also process requests faster: Haiku is the fastest and most cost-effective, Sonnet is twice as fast as Claude 2 and Claude 2.1, and Opus delivers higher intelligence at speeds similar to its predecessors. Enhanced vision capabilities allow the models to process diverse visual formats, and unnecessary refusals have been significantly reduced. Accuracy has improved as well, with Opus roughly twice as accurate as Claude 2.1 on complex factual questions. The models launch with a 200K context window, can accept inputs of more than 1 million tokens, and demonstrate robust recall.

Nvidia bans using translation layers for CUDA software to run on other chips — new restriction apparently targets ZLUDA and some Chinese GPU makers

Nvidia has updated its licensing terms to prohibit the use of translation layers for running CUDA-based software on non-Nvidia hardware platforms. The clause, which was absent from the documentation of previous versions but is present in CUDA 11.6 and newer, appears to target initiatives like ZLUDA and some Chinese GPU makers that have been using translation layers to run CUDA code. The move is seen as an attempt to protect Nvidia's dominance in accelerated computing, particularly for AI applications. However, recompiling existing CUDA programs for other platforms remains legal, and Nvidia's dominance could be challenged as more competitive hardware enters the market.

Competition in AI video generation heats up as DeepMind alums unveil Haiper

DeepMind alumni Yishu Miao and Ziyu Wang have launched Haiper, an AI-powered video generation tool, amid increasing competition in the field. The duo, who have backgrounds in machine learning and 3D reconstruction, shifted their focus to video generation six months ago, finding it a more intriguing problem. Haiper, which has raised $13.8 million in seed funding, lets users generate short videos for free from text prompts, with additional features such as animating images and repainting videos. While the company is currently focused on its consumer-facing website, it plans to develop a core video-generation model for commercial use. Haiper faces competition from other AI video generation tools such as OpenAI's Sora and the Google- and Nvidia-backed Runway.

Other News

Tools

Cloudflare announces Firewall for AI - Cloudflare has developed Firewall for AI, a protection layer for Large Language Models (LLMs) that identifies and prevents abuses and attacks, addressing the unique vulnerabilities and threats introduced by LLMs as Internet-connected applications.

Wix’s new AI chatbot builds websites in seconds based on prompts - Build a website using Wix's new AI chatbot by answering a few prompts, and then edit it in more conventional ways to create a personalized design.

Introducing TripoSR: Fast 3D Object Generation from Single Images - TripoSR introduces fast 3D object generation from single images, comparing its reconstructions with OpenLRM and emphasizing the use of diverse data rendering techniques to improve model generalization.

Kai-Fu Lee’s AI Company “01.AI” Announces the Open Source of the Yi-9B Model - "01.AI" has open-sourced the Yi-9B model, which it describes as the strongest in the Yi series for code and mathematics, surpassing other open-source models of similar size.

Business

OpenAI Fires Back at Musk Allegations With Trove of Emails - OpenAI responds to Elon Musk's lawsuit with evidence from his own emails, accusing him of trying to make the company part of Tesla Inc.

Waymo launches driverless rides for employees in Austin - The employee-only rides are a crucial step before the program opens to the public, as the company steadily expands its autonomous ride-hailing service.

Baidu Launches China's First 24/7 Robotaxi Service - Baidu's Apollo Go launches China's first 24/7 robotaxi service, expanding its autonomous driving operations and offering special services for female users.

AWS launches Generative AI Competency to grade AI offerings - AWS launches Generative AI Competency program to validate and highlight partners with proven customer success in generative AI, making it easier for businesses to identify and adopt the best-suited AI solutions.

Key OpenAI Executive Played a Pivotal Role in Sam Altman’s Ouster - OpenAI's chief technology officer, Mira Murati, played a pivotal role in the ouster of Sam Altman, raising concerns about his management and sharing them with the board.

Midjourney Accuses Stability AI of Image Theft, Bans Its Employees - Midjourney accuses Stability AI of image theft and has banned Stability AI employees from its service, while both CEOs deny involvement and promise to cooperate with the investigation.

Research

Stable Diffusion 3: Research Paper - Stable Diffusion 3 outperforms other text-to-image generation systems in prompt following, typography, and visual aesthetics, and comes in multiple model sizes to lower hardware barriers.

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation - PixArt-Σ is a Diffusion Transformer model capable of generating high-fidelity 4K images from text prompts, achieving superior image quality and prompt adherence with a significantly smaller model size.

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models - A new TTS system, NaturalSpeech 3, utilizes factorized diffusion models and a neural codec with factorized vector quantization to generate natural speech in a zero-shot way, outperforming state-of-the-art TTS systems on quality, similarity, prosody, and intelligibility.

Beyond Language Models: Byte Models are Digital World Simulators - Byte models, like bGPT, use next byte prediction to simulate the digital world, achieving high performance across various modalities and offering new possibilities for predicting, simulating, and diagnosing algorithm or hardware behavior.

Design2Code: How Far Are We From Automating Front-End Engineering? - AI has made significant progress in generating code from visual designs, with multimodal language models showing promise in converting designs into code implementations, as demonstrated by benchmarking and human evaluations.

Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap - The paper proposes a framework for evaluating the reasoning capabilities of language models using functional benchmarks and identifies a significant reasoning gap in state-of-the-art models, motivating the goal of building "gap 0" models with minimal performance differences.

StarCoder 2 and The Stack v2: The Next Generation - StarCoder 2 and The Stack v2, developed as part of the BigCode project, introduce larger training sets and models that outperform others in code language modeling benchmarks, with a commitment to openness and transparency.

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection - A new training strategy called GaLore reduces optimizer-state memory usage by up to 65.5% while maintaining efficiency and performance for pre-training and fine-tuning large language models, making it feasible to pre-train a 7B model on consumer GPUs without model parallelism, checkpointing, or offloading strategies.

KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents - KnowAgent introduces a novel approach to enhance the planning capabilities of Large Language Models (LLMs) by incorporating explicit action knowledge, resulting in improved performance in complex reasoning tasks and mitigating planning hallucinations.

Researchers tested leading AI models for copyright infringement using popular books, and GPT-4 performed worst - Leading AI models, including GPT-4, were tested for copyright infringement using popular books, with results showing that GPT-4 produced copyrighted content on 44% of prompts.

Concerns

OpenAI’s GPT Is a Recruiter’s Dream Tool. Tests Show There’s Racial Bias - AI-powered hiring tools, such as OpenAI's GPT, are found to systematically produce biases based on names, posing a serious risk of automated discrimination at scale, despite efforts to mitigate bias and increase objectivity.

We Hacked Google A.I. for $50,000 - Hackers collaborate to exploit vulnerabilities in Google's AI systems, uncovering an IDOR in Bard, a DoS vulnerability in Google Cloud Console, and a data exfiltration flaw in Bard's Google Workspace support, earning a total of $50,000 in bounties.

Google engineer indicted over allegedly stealing AI trade secrets for China - The engineer is accused of taking confidential files, including material on Google’s tensor processing unit (TPU) chips, and transferring them to a personal Google Cloud account.

Microsoft engineer warns company's AI tool creates violent, sexual images, ignores copyrights - Microsoft engineer raises concerns about the AI image generator, Copilot Designer, for creating violent and sexual images, ignoring copyrights, and lacking safeguards.

Fake Image Factories - In tests, AI image generators produced election disinformation in 41% of cases, prompting calls for responsible safeguards, collaboration with researchers, and clear pathways to report abuse.

Inside the World of AI TikTok Spammers - AI is being used to create low-quality spammy videos by recycling other people's content, allowing individuals to make money passively by flooding social media platforms with stolen and low-effort clips.

Man tries to steal driverless car in L.A., doesn’t get far: police - A man attempted to steal a driverless car in Los Angeles but could not operate the controls and was arrested; the incident has sparked concerns about the safety and regulation of autonomous vehicles.

Top AI researchers say OpenAI, Meta and more hinder independent evaluations - Top AI researchers are calling on generative AI companies to allow independent access to their systems, arguing that strict protocols are hindering safety-testing and chilling independent evaluations.

Gender Bias in AI (International Women's Day edition) - Gender bias in AI models reflects and exaggerates existing gender biases from the real world, and it is important to quantify and address such biases in order to mitigate them.

Policy

AMD Hits US Roadblock in Selling AI Chip Tailored for China - AMD's AI chip tailored for the Chinese market is deemed too powerful to sell without a license by US officials, leading to a potential roadblock for the company.

Explainers

I used generative AI to turn my story into a comic—and you can too - A generative AI platform called Lore Machine can turn text into images, allowing users to create comics and graphic novels from their stories with ease.

Your guide to Google Gemini and Claude 3.0, compared to ChatGPT - Google Gemini and Claude 3.0 are the latest powerful language models changing the AI tools landscape; this guide compares their features and capabilities with those of ChatGPT.