    NVIDIA uses 'a lifetime' of videos every day for AI training
    NVIDIA claims it is not violating copyright laws

    By Dwaipayan Roy
    Aug 06, 2024
    12:30 pm

    What's the story

    Leaked internal documents from NVIDIA suggest that the company has been using scraped videos from YouTube, Netflix, and other sources, to compile training data for its artificial intelligence (AI) products.

    The documents, which include Slack chats and emails, were obtained by 404 Media.

    They reveal that the company has been downloading 80 years' worth of video every day for this purpose.

    Project Cosmos

    What is AI training project 'Cosmos'?

    The leaked documents reveal that the data was used to train an AI model for NVIDIA's Omniverse 3D world generator, self-driving car systems, and "digital human" products.

    This was part of a project internally named Cosmos.

    The goal of Cosmos was to build a state-of-the-art video foundation model that encapsulates simulation of light transport, physics, and intelligence in one place to unlock various downstream applications critical to NVIDIA.

    Downloading tactics

    NVIDIA's video downloading strategy

    NVIDIA employees were instructed to use an open-source YouTube video downloader called yt-dlp, combined with virtual machines that refresh IP addresses to avoid being blocked by YouTube.

    The leaked documents show that up to 30 virtual machines on Amazon Web Services were used to download 80 years' worth of video per day.

    Full-length videos from various sources, including Netflix, but primarily YouTube, were downloaded for this purpose.
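
To put the reported figures in perspective, the short Python sketch below works through the arithmetic. It takes the 80 years per day and 30 virtual machines from the leaked documents; the average year length of 365.25 days and the even split of work across all 30 machines are assumptions made only for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the figures reported in the leak.
# Assumptions (not from the source): a year averages 365.25 days, and the
# work is spread evenly across the maximum reported number of machines.

HOURS_PER_YEAR = 365.25 * 24  # 8,766 hours in an average year


def video_hours_per_day(years_per_day: float) -> float:
    """Convert 'years of video per day' into hours of video per day."""
    return years_per_day * HOURS_PER_YEAR


def per_vm_rate(years_per_day: float, vm_count: int) -> float:
    """Hours of video each machine must fetch per wall-clock hour."""
    return video_hours_per_day(years_per_day) / vm_count / 24


total = video_hours_per_day(80)
rate = per_vm_rate(80, 30)

print(f"{total:,.0f} hours of video per day")
print(f"about {rate:,.0f} hours of video per machine per wall-clock hour")
```

Under these assumptions, the total comes to roughly 700,000 hours of video per day, or close to a thousand hours of footage per machine for every hour of real time. With fewer than 30 machines, the per-machine rate would be higher still.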

    Legal defense

    Legal stance on AI training methods

    When questioned about the legal and ethical implications of using copyrighted content for AI training, NVIDIA defended its practice as being "in full compliance with the letter and the spirit of copyright law."

    The company argued that copyright law protects expressions but not facts, ideas, data or information.

    It also invoked fair use protections for transformative purposes such as model training.

    Industry reactions

    Google and Netflix worried over NVIDIA's practices

    Google and Netflix have both expressed concerns about NVIDIA's practices.

    A Google spokesperson pointed to earlier comments by YouTube CEO Neal Mohan, who stated that using YouTube videos to refine AI video generators would be a "clear violation" of YouTube's terms of use.

    A Netflix spokesperson confirmed that the platform does not have a deal with NVIDIA for content ingestion, and that its terms of service do not allow scraping.

    Legal concerns

    Dismissal of legal concerns revealed in leaked documents

    The leaked documents also reveal that questions from NVIDIA employees about potential legal issues were often dismissed by project managers.

    They were told that the decision to scrape videos without permission was an "executive decision" and that the topic of what constitutes fair, ethical use of copyrighted content and academic, noncommercial-use datasets was an "open legal issue."

    Academic datasets

    NVIDIA's use of academic datasets raises concerns

    The documents show that NVIDIA used datasets compiled by academics for research purposes, despite these often being licensed for non-commercial use only.

    This practice has raised concerns among AI researchers about the appropriate use of their publicly available datasets.

    The documents highlight the 'don't ask for permission' ethos prevalent among technology companies when it comes to scraping massive amounts of copyrighted content into datasets for training some of the world's most valuable AI models.

    All rights reserved © NewsBytes 2025