Researchers at Google DeepMind studied whether generative systems can create chess problems that human experts judge to be artistically valuable. They trained models to compose puzzles emphasizing aesthetics, novelty, and tactical clarity, then asked players and problemists to score the outputs against human-composed benchmarks. Evaluators preferred several of the AI-generated compositions and noted originality that differed from the brute-force lines typical of engines. For AI researchers and creators, the findings test computational-creativity metrics and evaluation methods, while raising questions about authorship, disclosure standards, and how synthetic compositions should be credited in competitive composing and publishing contexts.
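
The summary does not spell out the scoring protocol. As a rough illustration only, the Python sketch below shows one way expert ratings for model-generated and human-composed puzzles could be aggregated for comparison; the rating scale, field names, and scores are hypothetical and are not taken from the study.

```python
# Illustrative only: aggregate hypothetical expert ratings for model-generated
# and human-composed puzzles and compare the group means. Not the study's data
# or evaluation method.
from statistics import mean

# Hypothetical ratings on a 1-5 scale from several evaluators per composition.
ratings = {
    "ai_001":    {"source": "model", "scores": [4, 5, 4]},
    "ai_002":    {"source": "model", "scores": [3, 4, 4]},
    "human_001": {"source": "human", "scores": [4, 4, 3]},
    "human_002": {"source": "human", "scores": [5, 4, 4]},
}

def mean_by_source(data, source):
    """Average the per-composition mean scores for one source group."""
    per_puzzle = [mean(entry["scores"]) for entry in data.values()
                  if entry["source"] == source]
    return mean(per_puzzle)

if __name__ == "__main__":
    print(f"model compositions: {mean_by_source(ratings, 'model'):.2f}")
    print(f"human compositions: {mean_by_source(ratings, 'human'):.2f}")
```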