Have a Cookie is a Google Chrome Extension for analyzing, managing and restricting cookies based on personal browsing history.
Good Cookie vs. Bad Cookie
Aren’t all cookies bad? (1)
Well, yes and no. Nothing is ever black and white, and cookies are no exception.
For the purpose of discussion, I’ll break cookies into two categories. Those set by a willful action of the user, for example signing into a website that sets a cookie to keep you logged in, we’ll call user-set. Those set automatically, or by an unknown action not willfully performed by the user, we’ll call foreign-set.
So for the most part, user-set are good and foreign-set are bad, right? Hardly, and painting a picture that this is the case is absurd, if not outright fraud. Cookies get set for an incomprehensible number of different purposes, but for only one reason: to provide state (aka memory) to a stateless transport vehicle, HTTP. The unofficial cookie FAQ is a great, but long, read on the subject.
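To make the "state for a stateless protocol" idea concrete, here is a minimal sketch using Python's standard `http.cookies` module. The cookie name `session_id` and its value are made up for illustration; real sites choose their own names and attributes.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header that hands the browser some state.
# "session_id" and "abc123" are hypothetical values used only for this sketch.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
response_cookie["session_id"]["path"] = "/"
set_cookie_header = response_cookie.output(header="Set-Cookie:")

# Browser side: on the next request the same value comes back in a Cookie
# header, which is how the server "remembers" who is asking.
request_cookie = SimpleCookie()
request_cookie.load("session_id=abc123")
remembered = request_cookie["session_id"].value
```

Whether that remembered value is a login session or a tracking identifier is invisible at this level, which is why the purpose of a cookie is so hard to judge from the cookie alone.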
Then what is a good cookie? (1) Well, that all depends on what you’re comfortable having passed around the internet. Over the years sites have used iframes to serve 3rd party content. A large portion (I couldn’t find any real numbers) of these tend to be ads. However, some of the latest widgets and buttons from Facebook and Twitter, as well as a host of other sites, leverage iframes to show you ‘your friends’ and allow you to log in or comment easily without needing an additional account.
These cookies are how disqus knows who you are on avc.com and markcoatney.com as well as on disqus.com. The list goes on and on. Technically, if you’re going to call foreign-set bad, these cookies fall into that category. They come from 3rd party sites, which in this case you actively visited at some point, that set cookies on another site automatically, and typically without your knowing consent. That is to say, all these sites publish terms of service and privacy policies which entitle them to set these cookies, but you most likely had no say if techcrunch or techdirt.com decided to place such widgets on their site. I certainly don’t believe foreign-set are bad in this case, but there are people who do, and articles and tools to accommodate that, such as this Lifehacker article.
So then that just means the cookies set by ads are bad, right? (1) That’s what I thought, so I embarked on an experiment. Using Chrome, I installed Adblock, which recently gained the ability to completely block ad resources from loading. I thought my cookie troubles had ended. So when the WSJ privacy series began, and the conversation it sparked heated up, I began to wonder: just how many cookies had my machine acquired? More specifically, just how many cookies had I acquired for sites without any entry in my history, meaning my browser had never actually been there?
The findings were astounding. My machine is only 5 months old, and while I am active on the internet, my time is generally spent on technology-related themes. I rarely venture into the tabloidesque blogs that are renowned for sleazy ads with dancing cowboys and giant peeling bananas.
I immediately recorded my initial findings, based on a simple filter for Google Analytics-style cookies regardless of source, in a tweet:
1161 cookies from google analytics, or 37% of 3205 cookies. Almost 40% of all the cookies on my computer are from Google?!
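A filter of this kind could be sketched as follows. This is not the filter actually used for the tweet; it assumes that "Google Analytics-style" means cookie names starting with `__utm` (as in the classic `__utma`, `__utmb`, `__utmc` and `__utmz` cookies), and the function names are hypothetical.

```python
def is_analytics_style(cookie_name: str) -> bool:
    """Assumption: classic Google Analytics cookies share the "__utm" prefix."""
    return cookie_name.startswith("__utm")


def analytics_share(cookie_names):
    """Return (matching, total, fraction) for a list of cookie names."""
    total = len(cookie_names)
    matching = sum(1 for name in cookie_names if is_analytics_style(name))
    fraction = matching / total if total else 0.0
    return matching, total, fraction


# Tiny made-up sample; a real run would read the names from the browser's
# cookie store instead.
sample_names = ["__utma", "__utmz", "session_id", "prefs"]
matching, total, fraction = analytics_share(sample_names)
```

Because the filter looks only at the cookie name, it counts these cookies regardless of which site's domain they were set under, which matches the "regardless of source" wording above.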
I continued to dig deeper into the subject, examining the number of different cookie domains and the number of normalized domains, and comparing those domains against my browser history. 9,017 unique URLs were found, the oldest of which dated to April, close to when I began using Chrome. It was an ideal environment for a comparison, though highly biased by my own preventive techniques.
Of all the cookies and history data analyzed, 286 normalized domains had no history entry, which left 732 cookies. This was out of a total of 1,258 domains that collapsed into 827 normalized domains. I found 152 normalized domains where I had actively visited the homepage, and the remaining 389 normalized domains had a history entry. A one-time analysis found that 99.9% of the history entries in this final group were loaded actively by the user (myself), typically through active navigation.
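The comparison described above might be sketched like this. It is an illustration under stated assumptions, not the analysis code itself: "normalizing" is assumed to mean dropping the leading dot and keeping the last two labels of the domain (which mishandles suffixes such as `co.uk`; a public suffix list would be needed for those), and the function and bucket names are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_domain(domain: str) -> str:
    """Naive normalization: lower-case, drop a leading dot, keep last two labels."""
    labels = domain.strip().lstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def classify_cookie_domains(cookie_domains, history_urls):
    """Bucket normalized cookie domains by how they appear in browsing history."""
    visited = set()    # normalized domains with any history entry
    homepages = set()  # normalized domains whose homepage was visited
    for url in history_urls:
        parts = urlsplit(url)
        if not parts.hostname:
            continue
        normalized = normalize_domain(parts.hostname)
        visited.add(normalized)
        if parts.path in ("", "/"):
            homepages.add(normalized)

    buckets = {"no_history": set(), "homepage_visited": set(), "other_history": set()}
    for domain in cookie_domains:
        normalized = normalize_domain(domain)
        if normalized in homepages:
            buckets["homepage_visited"].add(normalized)
        elif normalized in visited:
            buckets["other_history"].add(normalized)
        else:
            buckets["no_history"].add(normalized)
    return buckets


# Made-up sample data for illustration only.
cookie_domains = [".doubleclick.net", "www.disqus.com", ".disqus.com", "news.example.org"]
history_urls = ["http://disqus.com/", "http://blog.example.org/post/1"]
buckets = classify_cookie_domains(cookie_domains, history_urls)
```

In this sample, `.disqus.com` and `www.disqus.com` collapse into one normalized domain, the same kind of collapse that turned 1,258 domains into 827, and `doubleclick.net` lands in the no-history bucket because nothing in the history matches it.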
The hypothesis was that cookies whose domain could not be found in my history are ‘bad’, even if not all foreign-set cookies are, and the conclusion was that it holds. Of the 286 domains I examined, I could not find a single cookie whose usage could be useful to me personally. That is to say those cookies have no value to me, but do have financial or statistical value to others.
Cookies from sites such as facebook, twitter and disqus were found in the group of domains I had actively visited, as expected. However, blocking foreign-set or '3rd party' cookies outright stops these cookies too, which is not always useful. The cookies for the 389 domains with history entries generally held things such as passwords, preference settings, usernames or miscellaneous data.
So then what is a good cookie? (1) The answer, like almost everything in the real world, is that it all depends. That’s where Have a Cookie comes in. It’s a tool to provide the level of granularity needed to control user-set and foreign-set cookies so you can decide what a good cookie and a bad cookie is. Or at least, that’s what it aspires to be. Learn more about the release state in this post.
- 1: ( I love asking questions that I will immediately answer )
The project is currently open source and available on GitHub at http://github.com/gregory80/Have-a-Cookie
Currently, it is available only as an unpacked extension. Feedback welcome, more to come on project intent and timelines.