Semantics-Aware Cookie Purpose Compliance

Apr 22, 2025
Baiqi Chen, Jiawei Lyu, Tingmin Wu, Mohan Baruwal Chhetri, Guangdong Bai
Abstract
Websites commonly display cookie banners to inform users about the use and purposes of cookies. However, they may still, whether intentionally or unintentionally (e.g., due to imported third-party libraries), mis-declare cookies that may be abused for tracking. In this work, we introduce COOVER (cookie value examiner) to assess the non-compliance between the website-declared purpose and the semantic-intended purpose of cookies (denoted as potential cookie purpose violation). We advocate that the value of the cookie is a more reliable indicator of its semantic-intended purpose than other features such as expiration time. COOVER decomposes the cookie value into primitive segments representing minimal semantic units, and fine-tunes a GPT-3.5 model to automatically interpret their value-inferred semantics. Based on the interpretation, it classifies cookies into four GDPR-defined purposes. COOVER achieves an F1 score of 95%, significantly outperforming other methods. We employ COOVER to analyze the Alexa Top 1k websites to understand the status quo of potential cookie purpose violation on the web. Remarkably, out of 15,339 cookies across these websites, only 3.1% qualify as truly necessary cookies, while 44.1% of websites suffer from issues of potential purpose violation.
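To make the decomposition step concrete, the following is a minimal illustrative sketch of splitting a cookie value into segments and attaching a coarse label to each. It is not COOVER's implementation: the delimiter set, the label names, and the helper functions `split_segments` and `label_segment` are assumptions introduced here, and the fine-tuned model that interprets segment semantics is not reproduced.

```python
import re
from urllib.parse import unquote

# Assumed delimiter set for illustration; the paper's actual rules may differ.
DELIMITERS = re.compile(r"[.|:;&=,]+")
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
DIGITS_RE = re.compile(r"[0-9]+")
HEX_RE = re.compile(r"[0-9a-fA-F]{16,}")


def split_segments(value: str) -> list[str]:
    """URL-decode a cookie value and split it on the assumed delimiters."""
    decoded = unquote(value)
    return [seg for seg in DELIMITERS.split(decoded) if seg]


def label_segment(segment: str) -> str:
    """Attach a coarse, heuristic label to one segment (hypothetical labels)."""
    if UUID_RE.fullmatch(segment):
        return "uuid"
    if DIGITS_RE.fullmatch(segment):
        # 10 or 13 digits is treated as a Unix timestamp (seconds or milliseconds).
        return "timestamp" if len(segment) in (10, 13) else "integer"
    if HEX_RE.fullmatch(segment):
        return "hex_id"
    return "text"


def decompose(value: str) -> list[tuple[str, str]]:
    """Return (segment, label) pairs for a cookie value."""
    return [(seg, label_segment(seg)) for seg in split_segments(value)]


if __name__ == "__main__":
    # Made-up value in a common analytics-style layout.
    for seg, label in decompose("GA1.2.987654321.1713744000"):
        print(f"{seg}\t{label}")
```

In a pipeline of the kind the abstract describes, segments such as these would be passed to a learned model for interpretation; the regular-expression labels above only stand in for that step so that the example is self-contained.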
Publication
In Proceedings of the ACM on Web Conference 2025