AI companies such as Anthropic and OpenAI face a surge of copyright lawsuits as they push against the limits of the data available to train their models. As AI has advanced, companies have increasingly turned to scraping data from the web, prompting legal challenges from publishers who claim their content is being exploited without permission. The issue has grown more pressing as AI firms hit a "data frontier," forcing them to seek new data sources or rely on synthetic data. The outcomes of these cases, including The New York Times' lawsuit against OpenAI, could reshape how AI companies access and use content.