Articles by FavTutor

News Sites Push Back Against Apple’s AI Crawlers: Here’s Why

by Kaustubh Saini
September 3, 2024

News publishers are finally recognizing the value of their human-written content, and they don’t want to give it away for free to AI giants. Getting up-to-date information for LLMs will not be easy now that Apple’s AI crawlers are being blocked.

Highlights:

  • Major news publishers, including The New York Times and the Financial Times, are blocking Apple from using their content for AI training.
  • This follows Apple’s release of a tool that lets websites stop their data from being used to train Apple’s AI models.
  • The move shows the growing tension between content publishers and AI companies over how content may be used.

Why Are Websites Blocking Apple’s AI?

So, this is what happened. Apple launched a tool that lets websites opt out of having their content used to train its AI models. This is great for website owners, but maybe not for the iPhone manufacturer. At least, that’s what a recent WIRED report suggests.

As we know, AI models like GPT-4o (the model behind ChatGPT) need large amounts of data for training. Much of it comes from content on websites across the internet. These models also need up-to-date information about current events to give answers that are as accurate as possible.

So, AI companies also want information from news publishers. But that doesn’t mean they get it for free. They need permission from these sites; otherwise, publishers argue, it would be an infringement of their intellectual property.

Big news publishers like The New York Times, The Financial Times, Vox Media, and The Atlantic have already chosen to block Apple’s AI crawlers from using their content.

The new user agent, called Applebot-Extended, gives publishers control over whether their content can be used to train Apple’s foundation models, which will later power generative AI features across its products.

The company said, “Allowing Applebot-Extended will help improve the capabilities and quality of Apple’s generative AI models over time.” But why should news publishers give away their hard-earned content for free?

Here’s Why News Publishers Are Doing It

News publishers have updated their robots.txt files to block the Applebot-Extended user agent. They’re doing this to make sure their content isn’t used without their permission. Vox Media’s Lauren Starke said:

“We’re blocking Applebot-Extended across all of Vox Media’s properties, as we have done with many other AI scraping tools when we don’t have a commercial agreement with the other party.”

Lauren Starke, Vox Media
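
To make the mechanism concrete, here is a minimal sketch of what such a robots.txt rule could look like and how a crawler that honors it would read it. The robots.txt content and the example.com URL are assumptions made up for illustration, not copied from any publisher's site. The check uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher (not taken from a real site).
# It blocks only the Applebot-Extended user agent and leaves everything else open.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder article URL, used only for the demonstration.
article_url = "https://example.com/news/some-article"

applebot_extended_allowed = is_allowed(ROBOTS_TXT, "Applebot-Extended", article_url)
applebot_allowed = is_allowed(ROBOTS_TXT, "Applebot", article_url)

print("Applebot-Extended allowed:", applebot_extended_allowed)  # False
print("Applebot allowed:", applebot_allowed)                    # True
```

Under these assumed rules, the regular Applebot can still reach the page while Applebot-Extended is turned away. Keep in mind that robots.txt is a voluntary convention, so a rule like this only has an effect when the company behind the user agent chooses to respect it.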

This could change how AI models are trained and might also result in new deals in which companies pay for access to content. OpenAI has recently partnered with many news publishers, and it has made a deal with Stack Overflow as well.

So, if the AI giants want content, they will have to pay for it. Nothing comes free for them. The decisions being made now could have a big impact on how AI and online content work together in the future.

Note that the new user agent is an extension of Apple’s original web crawler, Applebot. While Applebot helps power search features in products like Siri, the new extension lets websites choose whether their data can be used to train Apple’s AI models. News publishers don’t have a problem with Applebot, but they are not ready to allow Applebot-Extended.

In a separate analysis, Ben Welsh found that about 25% of the 1,167 news websites he surveyed (mainly from the US) are blocking AI crawlers as well.
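
A survey like this can be approximated by checking each site's robots.txt against a list of AI user agents. The sketch below is only an illustration of the idea, not the method used in that analysis. The site names and robots.txt contents are hypothetical sample data, and the agent list is an assumption (Applebot-Extended comes from this article; GPTBot is the user agent token OpenAI documents for its crawler).

```python
from urllib.robotparser import RobotFileParser

# Assumed list of AI-related user agent tokens to test for.
AI_AGENTS = ["Applebot-Extended", "GPTBot"]

# Hypothetical sample data: site name -> robots.txt text (not real sites).
SAMPLE_SITES = {
    "site-a.example": "User-agent: Applebot-Extended\nDisallow: /\n",
    "site-b.example": "User-agent: *\nDisallow: /private/\n",
    "site-c.example": "User-agent: GPTBot\nDisallow: /\n",
    "site-d.example": "",
}


def blocks_agent(robots_txt: str, agent: str) -> bool:
    """Return True if the rules forbid this agent from fetching the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")


def share_blocking(sites: dict, agents: list) -> tuple:
    """Return (fraction of sites blocking any listed agent, sorted blocking sites)."""
    blocking = sorted(
        name
        for name, robots_txt in sites.items()
        if any(blocks_agent(robots_txt, agent) for agent in agents)
    )
    return len(blocking) / len(sites), blocking


share, blocking_sites = share_blocking(SAMPLE_SITES, AI_AGENTS)
print(f"{share:.0%} of sample sites block at least one AI agent: {blocking_sites}")
```

Testing only the site root is a simplification: a publisher could block an AI agent from some sections and not others, which this check would miss.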

Conclusion:
The fight over AI data scraping is heating up, with more websites blocking their content from being used to train LLMs. This could change how AI is developed and how online content is protected. The choices being made today will likely shape the future of AI and digital content.

Kaustubh Saini

I'm Kaustubh Saini, founder of FavTutor. I love breaking down complex AI concepts, trends, and news, writing about them until an AGI agent takes over my job. When I’m not writing, I’m building AI-powered tools to make learning more accessible and engaging at FavTutor.

About FavTutor

FavTutor is a trusted online tutoring service that connects students with expert tutors for guidance on Computer Science subjects like Java, Python, C, C++, SQL, Data Science, Statistics, etc.

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.
