
Reading Notes: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction

About: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction (Information and Privacy Commissioner of Ontario, Canada, 2013).


Tweets from Privacy by Design (@embedprivacy) signaled the publication of A Primer on Metadata: Separating Fact from Fiction (an 18-page PDF document). As I am currently working on a related subject, I read it at once… and was disappointed. The actual primer on what metadata is runs only two pages; it is rather minimal, inaccurate and not quite convincing.

Metadata (formal definition):

“Metadata is (…) essentially information about other information, in this case, relating to our communications.”

In this case: “Metadata is information generated by our communications devices and our communications service providers, as we use technologies like landline telephones, mobile phones, desktop computers, laptops, tablets or other computing devices.”

Cavoukian, 2013, p. 3

Metadata (descriptive definition): “Metadata includes information that reveals the time and duration of a communication, the particular devices, addresses, or numbers contacted, which kinds of communications services we use, and at what geolocations. And since virtually every device we use has a unique identifying number, our communications and Internet activities may be linked and traced with relative ease – ultimately back to the individuals involved.”

Cavoukian, 2013, p. 3

 

As presented in the document, these two definitions are at odds with one another: the formal one refers to information items about other information items, whereas the descriptive one refers to information about processes. Computer specialists, however, do recognize many kinds of metadata, even though they may use different typologies.

The few lines entitled “A Day in the Life…” (pp. 3-4) provide a good illustration of how (processes) “metadata created by the devices that two individuals use to communicate with each other can reveal a great deal” about them.

Finally, the section “Metadata May Be More Revealing Than Content” (pp. 4-5) reads more like a series of arguments from authority than an actual demonstration.

Need for evidenced arguments

Coincidentally, while answering engineering students during a lecture given at Polytechnique Montréal last week, I had to remind them that an information set is metadata not by some intrinsic nature, but merely by the context of its initial production and use. Classically, the term data referred to information items that are available (or to be produced) for the solution of a problem or the completion of a reasoning, an inquiry or a piece of research. As soon as one uses “metadata” (whatever the type) in this way, they become “data”, and thus are no longer “metadata”.

Ever since the very first general-purpose computing machine, computers, and digital devices after them, have required metadata to work. They also produce other metadata as by-products of their processes. And from the dawn of informatics, those metadata were at once reused as data.

There is nothing new about using metadata to produce knowledge about people. A classic example is the introduction of computerized cash registers. As the machine processes the customers’ purchases, it produces clock metadata that can be used to assess how fast the clerks punch in (now scan) items, take payments and give change, pack the goods and move on to the next customers.
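To make the cash register example concrete, here is a minimal sketch of how such clock metadata could be turned into data about clerks. The scan log, the clerk identifiers and the function name are all hypothetical, invented for illustration; a real register system would record far more and in a different form.

```python
from datetime import datetime
from statistics import mean

# Hypothetical scan log: (clerk id, time at which an item was scanned).
# The register produces these timestamps only as a by-product of processing sales.
scan_log = [
    ("clerk_a", "2013-07-22 10:00:00"),
    ("clerk_a", "2013-07-22 10:00:04"),
    ("clerk_a", "2013-07-22 10:00:07"),
    ("clerk_b", "2013-07-22 10:00:00"),
    ("clerk_b", "2013-07-22 10:00:09"),
    ("clerk_b", "2013-07-22 10:00:20"),
]


def mean_seconds_between_scans(log):
    """Return, per clerk, the average gap in seconds between consecutive scans."""
    times = {}
    for clerk, stamp in log:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        times.setdefault(clerk, []).append(parsed)

    result = {}
    for clerk, stamps in times.items():
        stamps.sort()
        gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
        if gaps:  # a clerk with a single scan has no gap to measure
            result[clerk] = mean(gaps)
    return result


speeds = mean_seconds_between_scans(scan_log)
print(speeds)
```

Nothing in the log was recorded in order to evaluate anyone; it becomes data about the clerks the moment someone asks that question of it.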

Anytime an operation is linkable to a human user, the operation’s metadata can be exploited as data about this human user (and anyone related to that person). Video games provide good examples of how the same outputs can simultaneously be process metadata and player data.

The relative artificiality and mutability of the distinction between data and metadata become obvious when one considers (as these tweet structure maps show) that making a tweet of a maximum of 140 characters can easily require the production of between 500 and 1000 characters of metadata, which include… the tweet message itself!
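The proportion can be sketched with a toy record. The structure below is a hypothetical, heavily simplified stand-in for a tweet: the field names and values are assumptions made for illustration, not the actual format used by Twitter, and the character counts it yields say nothing about real tweets.

```python
import json

# Hypothetical, simplified tweet record; field names are illustrative only.
tweet = {
    "id": 1234567890,
    "created_at": "2013-07-22T10:00:00Z",
    "text": "Reading a primer on metadata.",
    "lang": "en",
    "source": "web",
    "user": {
        "id": 42,
        "screen_name": "example_user",
        "time_zone": "Eastern Time (US & Canada)",
        "followers_count": 100,
    },
    "geo": None,
    "in_reply_to_status_id": None,
}


def weights(record, message_field="text"):
    """Return (characters in the message, characters in everything else)."""
    message_chars = len(record[message_field])
    total_chars = len(json.dumps(record))  # the whole record as serialized text
    return message_chars, total_chars - message_chars


message_chars, other_chars = weights(tweet)
print(f"message: {message_chars} chars, everything else: {other_chars} chars")
```

Note that the message is just one field among the others in the record, which is exactly why the line between “data” and “metadata” is so hard to draw here.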

And indeed, the relative weights of “metadata” and “data” in particular instances today can often be startling… if one can still distinguish between the two.

Also, a need to make evidence evident

How come there is no readily available button I could click to see the whole tweet actually produced, not only the message I wrote and sent?

Or how come there is no readily available command to display what information my mobile phone service actually produces minute by minute?

And as I pointed out to Polytechnique’s engineering students: if NSA’s work is essentially done with computerized devices, how come Congress does not have a dashboard that harnesses the metadata about what kinds of operations NSA actually carries out? Had such metadata been available, would Director James Clapper have been able to lie so easily about NSA’s operations before Congress, with Congress discovering it only through documents leaked by a whistleblower? After all, would it not be only metadata about systems’ uses, not data from the individual intelligence operations themselves? 😉

Such are questions of critical and practical political significance. Because they breed other questions about who decides the production of such information. About its uses. About who controls it. About its consequences. And so on. Of critical and practical significance also because they could turn a defensive stance into one of political affirmation. Such questions stem from an understanding of the nature of information and information processing. This is why it is so important to deepen and strengthen such understanding, as well as to popularize it and make it usable by all citizens.

So if you know any instructive work on the subject…


“Lawful access” bill: journalists discovering being targeted

A sudden tug of war between the Charest government and journalists caused a shock wave whose echoes have rippled throughout the Canadian journalistic profession. A jolt that could help them realize how the “lawful access” bill introduced this Monday, Feb. 13 also concerns journalists and media organizations.

A threat

Last week, the Charest government announced that the Director of Criminal and Penal Prosecutions and the Sûreté du Québec (provincial police force) would investigate leaks to the media related to the case of Ian Davidson, a retired Montreal police officer suspected of attempting to sell lists of police informants to organized crime. Neither the Minister of Public Safety, Robert Dutil, nor Premier Jean Charest has agreed to guarantee that journalists would not be investigated or wiretapped. (more…)


Winning against the “lawful access” bills: Two strategic intuitions

Are there actions we could start today in a decisive campaign against the adoption of so-called “lawful access” bills by Canada? I came to answer “yes” while listening to a presentation by Antoine Beaupré, system administrator at Koumbit. It was during a public meeting entitled “‘Illegal access’ and the attack on internet freedoms”, on February 3, 2012, in Montreal.

Let us recall that the “lawful access” bills, which have already died three times because of the dissolution of Parliament, have not yet been tabled again. However, it is expected that the Harper government will go ahead. The latest versions of the legislation gave the police new powers to access data held by Internet service providers (ISPs). They allowed the mandatory disclosure of customer information without judicial oversight, as well as real-time monitoring across ISPs’ networks. All measures deemed unnecessary and dangerous, not only by civil libertarians, but by many police forces as well. A detailed legal analysis was published recently by the British Columbia Civil Liberties Association.

The meeting was organized by Koumbit, an IT workers’ coop that offers several services including web hosting: thus, it has already had its share of searches, both for information and of servers. Like many other businesses in that field, Koumbit fears the effects of the “lawful access” initiatives on the civil liberties of its customers and of all the citizens who use the Internet from anywhere in the world. Indeed, the opening presentation by Antoine Beaupré dealt less with the legal aspects of the bills than with their technical and political dimensions. (more…)


Public conversation: Autonomy, Surveillance and Democracy: Who will benefit from the digital traces generated by our every move?

On Thursday, October 6, 2011 (7 to 9 pm), I will be the guest of a University of the Streets Café conversation moderated by Sophie Ambrosi on the theme: Autonomy, Surveillance and Democracy: Who will benefit from the digital traces generated by our every move?

Computers, automatic tellers, phones and other electronic gadgets. Today, our relations with our close ones, other people and organizations go through machines processing thousands of information items about us. These texts, sounds and images become communications, transactions, records, decisions. They can be transformed into statistics and knowledge about individuals, groups and societies, even about the nature of the human animal (e.g., conditions of its health). Knowledge that can inform decisions, trivial or major. The information society is necessarily a surveillance society. So what kinds of surveillance are reprehensible in a free and democratic society? And which ones are desirable? Under what conditions?

The conversation will take place at Café l’Artère, 7000, Avenue du Parc (near Jean-Talon) in Montreal. Everyone is invited and admission is free. The event is organized by the Institute for Community Development, Concordia University.


Encrypted Https Google Search: Effective or Symbolic Measure?

Google recently announced that it now offers the possibility to search for documents confidentially through Secure Sockets Layer (SSL), the protocol for secure, encrypted Internet exchanges. Concretely, this means that between your computer and Google’s servers, no one can directly read either your queries or the search results (just as your financial transactions are kept confidential under SSL).

To benefit from this new “beta” service, one must go to https://www.google.com. Your browser should then indicate that the communication is secure (for example, by displaying a padlock). The localized sites of Google (such as google.fr or google.ca) do not offer this security, nor is it available for image and video searches.


For sure, this is a significant symbolic gesture from the web giant. It has been applauded by the Center for Democracy & Technology as “a shining embodiment of the concept of Privacy by Design.”

For sure, the fact that a player as important as Google provides an increasing number of services under SSL (web access to Google Mail under SSL has already been the default option since January 2010) could be an important signal to everyone on the Internet: it may be time to think about protecting a larger number of our Internet communications, even if it means slightly slower processing and transmission times (barely noticeable when one has a computer and a connection with some power).

However, does that new service actually change anything in the experience of those whose exercise of their liberties, or the confidentiality of whose work, requires them to escape surveillance by their employers, Internet service providers (ISPs) or States? (more…)
