Lab Notes, Notes, Reading Notes

Reading Notes: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction

About: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction (Information and Privacy Commissioner of Ontario, Canada, 2013).


Tweets from Privacy by Design (@embedprivacy) announced the publication of A Primer on Metadata: Separating Fact from Fiction (an 18-page PDF). As I am currently working on a related subject, I read it at once… and was disappointed. The actual primer on what metadata is runs only two pages: rather minimal, inaccurate and not quite convincing.

Metadata (formal definition):

“Metadata is (…) essentially information about other information, in this case, relating to our communications.”

In this case: “Metadata is information generated by our communications devices and our communications service providers, as we use technologies like landline telephones, mobile phones, desktop computers, laptops, tablets or other computing devices.”

Cavoukian, 2013, p. 3

Metadata (descriptive definition): “Metadata includes information that reveals the time and duration of a communication, the particular devices, addresses, or numbers contacted, which kinds of communications services we use, and at what geolocations. And since virtually every device we use has a unique identifying number, our communications and Internet activities may be linked and traced with relative ease – ultimately back to the individuals involved.”

Cavoukian, 2013, p. 3
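To make the descriptive definition concrete, here is a minimal sketch in Python of the kind of call-detail record a provider might generate. Every field name and value below is invented for illustration; it is not an actual provider schema.

```python
# A hypothetical call-detail record containing the kinds of fields the
# descriptive definition enumerates: time, duration, device identifier,
# number contacted, service used, geolocation. All values are invented.
cdr = {
    "timestamp": "2013-06-05T14:32:07Z",   # time of the communication
    "duration_s": 274,                      # duration in seconds
    "caller_imei": "35-209900-176148-1",    # unique device identifier
    "callee_number": "+1-514-555-0199",     # number contacted
    "service": "mobile_voice",              # kind of communications service
    "cell_geo": (45.5088, -73.5542),        # geolocation of the cell tower
}

# A subscriber directory held by the provider. Because the device
# identifier is unique, a single lookup traces the record to a person.
subscribers = {"35-209900-176148-1": "Alice Example"}

def trace(record, directory):
    """Link a metadata record back to the individual behind the device."""
    return directory.get(record["caller_imei"], "unknown")

print(trace(cdr, subscribers))  # -> Alice Example
```

The point of the sketch is the last step: none of the fields is "content", yet one join against a subscriber directory suffices to reach the individual.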


As presented in the document, these two definitions are at odds with one another: the formal one refers to information items about other information items, while the descriptive one refers instead to information about processes. Computer specialists do, however, recognize many kinds of metadata, even if they use different typologies.

The few lines entitled “A Day in the Life…” (pp. 3-4) provide a good illustration of how the (process) “metadata created by the devices that two individuals use to communicate with each other can reveal a great deal” about them.

Finally, the section “Metadata May Be More Revealing Than Content” (pp. 4-5) reads more like a series of arguments from authority than an actual demonstration.

The need for evidence-based arguments

Coincidentally, in a lecture given at Polytechnique Montréal last week, I had to remind engineering students that an information set is metadata not by some intrinsic nature, but merely by the context of its initial production and use. Classically, the term data has referred to information items that are available (or are to be produced) for the solution of a problem or the completion of a reasoning, an inquiry or a research project. As soon as one uses “metadata” in this way (whatever its type), it becomes “data”, and thus is “metadata” no longer.

From the very first general-purpose computing machine, computers – and digital devices since – have required metadata to work. They also produce other metadata as by-products of their processes. And from the dawn of informatics, such metadata were at once reused as data.

There is nothing new about using metadata to produce knowledge about people. A classic example is the introduction of computerized cash registers. As the machine processes customers’ purchases, it produces clock metadata that can be used to assess each clerk’s speed at punching (now scanning) items, taking payments and giving change, packing the goods and moving on to the next customer.
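The cash-register example can be sketched as follows. The event log and its timings are invented, but the computation shows how clock metadata produced as a by-product becomes performance data about the clerk:

```python
from datetime import datetime

# Hypothetical register-clock metadata for one transaction: each entry is
# an (event, timestamp) pair emitted as a by-product of normal operation.
events = [
    ("scan",    "12:00:01"),
    ("scan",    "12:00:04"),
    ("scan",    "12:00:06"),
    ("payment", "12:00:31"),
]

def seconds(t):
    """Convert an HH:MM:SS string to seconds since midnight."""
    return (datetime.strptime(t, "%H:%M:%S") - datetime(1900, 1, 1)).total_seconds()

# Average interval between scans: the machine's process metadata,
# reused here as data about the human operating it.
scan_times = [seconds(t) for e, t in events if e == "scan"]
intervals = [b - a for a, b in zip(scan_times, scan_times[1:])]
avg = sum(intervals) / len(intervals)
print(f"average seconds per item: {avg:.1f}")  # prints 2.5
```

Nothing in the log was recorded "about" the clerk; the reuse of the timestamps is what turns it into data about a person.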

Anytime an operation is linkable to a human user, the operation’s metadata can be exploited as data about that user (and anyone related to that person). Video games provide good examples of how the same outputs can simultaneously be process metadata and player data.

The relative artificiality and mutability of the distinction between data and metadata become obvious when one considers (as these tweet structure maps show) that posting a tweet of at most 140 characters can easily require producing between 500 and 1,000 characters of metadata, which include… the tweet message itself!
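The point about relative weights can be illustrated with a toy payload whose field names mimic a small subset of the Twitter API of the time. The mock below is far smaller than a real payload, but it already shows the message as one field among many:

```python
import json

# A much-simplified mock of a tweet payload; real API payloads carry far
# more fields. The message ("text") is just one field among the metadata.
tweet = {
    "id_str": "347002034098374656",
    "created_at": "Wed Jun 05 14:32:07 +0000 2013",
    "text": "Reading A Primer on Metadata: Separating Fact from Fiction",
    "source": "web",
    "lang": "en",
    "user": {"id_str": "12345678", "screen_name": "example_user",
             "followers_count": 42, "utc_offset": -14400},
    "geo": None,
    "retweet_count": 0,
}

payload_chars = len(json.dumps(tweet))
message_chars = len(tweet["text"])
print(f"message: {message_chars} chars, full payload: {payload_chars} chars")
```

Even this stripped-down mock makes the payload several times longer than the message; a full payload, with entities, profile data and timestamps, is larger still.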

And indeed, the relative weights of “metadata” and “data” in today’s particular instances can often be startling… if one can still distinguish between the two.

Also, the need to make evidence evident

How come there is no readily available button I could click to see the whole tweet actually produced, not only the message I wrote and sent?

Or how come there is no readily available command to display what information my mobile phone service actually produces, minute by minute?

And as I pointed out to Polytechnique’s engineering students: if the NSA’s work is essentially done with computerized devices, how come Congress does not have a dashboard harnessing the metadata about what kinds of operations the NSA actually performs? Had such metadata been available, could Director James Clapper have lied so easily about the NSA’s operations before Congress? Would Congress have discovered them only through documents leaked by a whistleblower? After all, would it not be only metadata about the systems’ uses, not data from the individual intelligence operations themselves? 😉

These are questions of critical and practical political significance, because they breed other questions: about who decides the production of such information, about its uses, about who controls it, about its consequences, and so on. They are of critical and practical significance also because they could turn a defensive stance into one of political affirmation. Such questions stem from an understanding of the nature of information and information processing. This is why it is so important to deepen and strengthen such understanding, as well as to popularize it and make it usable by all citizens.

So if you know any instructive work on the subject…

Debates, Field Remarks, Information & Law, Lab Notes, Living between the lines, Notes, Observations

“Lawful access” bill: journalists discovering they are targeted

A sudden tug of war between the Charest government and journalists has sent a shock wave rippling throughout the Canadian journalistic profession. A jolt that could help journalists realize how the “lawful access” bill introduced this Monday, February 13, also concerns them and their media organizations.

A threat

Last week, the Charest government announced that the Director of Criminal and Penal Prosecutions and the Sûreté du Québec (provincial police force) would investigate leaks to the media related to the case of Ian Davidson, a retired Montreal police officer suspected of attempting to sell lists of police informants to organized crime. Neither Minister of Public Safety Robert Dutil nor Premier Jean Charest agreed to guarantee that journalists would not be investigated or wiretapped. (more…)

Debates, Information & Law, Living between the lines, Notes, Observations

Winning against the “lawful access” bills: Two strategic intuitions

Are there actions we could start today in a decisive campaign against the adoption of so-called “lawful access” bills by Canada? I came to answer “yes” while listening to a presentation by Antoine Beaupré, system administrator at Koumbit, during a public meeting entitled “‘Illegal access’ and the attack on Internet freedoms”, held on February 3, 2012, in Montreal.

Let us recall that the “lawful access” bills, which have already died three times upon dissolutions of Parliament, have not been tabled again yet. However, the Harper government is expected to go ahead. The latest versions of the legislation gave the police new powers to access data held by Internet service providers (ISPs). They allowed the mandatory disclosure of customer information without judicial oversight, as well as real-time monitoring across ISPs’ networks. All of these are measures deemed unnecessary and dangerous, not only by civil libertarians but also by many police forces. A detailed legal analysis was published recently by the British Columbia Civil Liberties Association.

The meeting was organized by Koumbit, an IT workers’ co-op that offers several services including web hosting: as such, it has already had its share of searches for information and for servers. Like many other businesses in that field, Koumbit fears the effects of the “lawful access” initiatives on the civil liberties of its customers and of all citizens who use the Internet from anywhere in the world. Indeed, Antoine Beaupré’s opening presentation dealt less with the legal aspects of the bills than with their technical and political dimensions. (more…)

Communications, Debates, Information & Law, Lab Notes, Living between the lines, Notes, Reflections

Autonomy, Surveillance and Democracy: A Few Ideas for the Twenty-First Century

Text derived from my presentation

to the Citizen Forum on surveillance of communications

organized by the Quebec caucus of the New Democratic Party

Montreal, Notman House, Thursday, November 3, 2011

Regardless of the fate of the bill named “Lawful Access”, the information society will continue to develop. Yet an information society is necessarily a surveillance society. Hence the question: what role should parliaments, governments and civil society play, not only to preserve freedoms and democracy, but to enhance them?

Here I propose – in quick rough strokes, given the short time available – some reference points regarding the challenges the twenty-first century presents to us.

Social Life and Surveillance

Idea # 1: Surveillance is an integral component of all social life.

This is true of all human societies, likewise of many animal societies, and even of plant ones.

Idea # 2: Surveillance takes many forms with very different, even opposite consequences.

I am a grandfather. Obviously I have watched over my children and grandchild. However, the forms that such surveillance takes can lead children to more and more autonomy or, conversely, to dependence and submission.

That is why, idea # 3: The concepts proposed by author Ivan Illich of autonomy versus heteronomy, conviviality and counterproductivity are useful to this discussion.

These concepts can be applied, for example, to a convivial urban neighborhood that combines the functions of housing, labor, commerce and recreation. Such an area appears safe because its residents, workers, passersby and idle bystanders spontaneously and freely offer one another mutual, continuous, autonomous surveillance.

Conversely, an unconvivial, single-function neighborhood that is deserted during the night or day seems to generate insecurity. No amount of expensive police, guards or electronic surveillance will succeed in producing real security there. And such surveillance is likely to increase heteronomous forms of power over individuals and the community.

Hence, idea # 4: It is important to consider the complex interrelationships between environmental, physical, social and technical structures and conditions, on the one hand, and the forms of surveillance that these structures permit or not as well as their effects, on the other hand.

Assessment Criteria

And therefore, idea # 5: Respect for freedom is a necessary but totally insufficient assessment criterion (thus ineffective alone).

In addition, idea # 6 (stated earlier): The information society is necessarily a society where surveillance is becoming widespread, increasing in power and scope, and is being democratized.

Let us illustrate this with a surveillance activity which, unlike the “Lawful Access” bill on the State’s power over private communications, is conducted by private actors on public communications, namely: high-frequency stock trading, which constitutes some 60% of the volume of North American exchanges. This surveillance involves computers that, every microsecond, monitor and analyze transactions around the planet. It allows the same computer to purchase securities at one instant and resell them a few seconds later at a profit. The speeds of surveillance, analysis and decision-making are so great that human operators can only watch for possible failures, such as those that caused the Flash Crash of May 6, 2010, when these automatic systems suddenly made the Dow Jones Index plunge several hundred points within a few minutes.

Such capabilities are becoming more democratic. Remember that today a lower-end smartphone is already more powerful than the big mainframe computers that, in the sixties, most thought only States could afford. That the customers for data-mining software, indispensable for producing results from digital surveillance, are roughly divided among four sectors: academia (teaching and research), business (marketing, R&D), police and military intelligence, and what we call civil society (various organizations and individuals). That information items on the behavior of individuals and organizations have never been produced in such large numbers, nor ever been more accessible (just consider the wealth of personal information items disseminated via social media).

Some surveillance activities can easily be described as harmful, such as surveillance of the private communications of citizens or of their legitimate political activities. Other surveillance activities can easily be described as beneficial, such as those about who funds political parties and about who does what lobbying with which decision makers.

However, idea # 7: The majority of the surveillance activities that will emerge will not be so easily assessed: understanding their nature and their effects will require deliberations.

So, idea # 8: Drawing on a proposition from economists Samuel Bowles and Herbert Gintis, we could state that all surveillance should be subjected to the principles of freedom, but that any surveillance involving some exercise of social power should also be subjected to the principles of democracy.

These principles are to be applied, no matter the public or private nature of the actions being monitored; or the state, commercial or civilian identity of those conducting the surveillance.

Logically, the same principles should also apply to decision-making on the environmental, physical, social and technical structures and conditions that determine the forms surveillance may or may not take. Indeed, various social movements express the same demand, whether about shale gas extraction or high finance: one’s obligation to submit to the action of another calls for one’s right to know and one’s right to have a say.

As a Preliminary Conclusion

Idea # 9: Such radical democratization calls for deep legal, parliamentary and political transformations from the local to the international levels.

Such changes could indeed be facilitated by possible developments of information societies.

However, idea # 10: The exact forms these changes should take remain yet to be defined.

Here, our situation is similar to that of various protest movements (such as Occupy Wall Street) that clearly identify how current practices are unacceptable without being able to define what the alternatives should be. It is, however, just as urgent to conceive concrete solutions. Let us illustrate with two cases.

Electronic payment

The first case is about the privatization of a decision of a public nature: the introduction in North America of smart banking cards, which raises issues of individual and societal surveillance. Electronic payment is a “radical monopoly”, to use another concept from Ivan Illich: while citizens retain the choice of the financial institution that provides the banking card, a single electronic payment system is imposed on all financial institutions and on all their customers in a given territory.

However, the choice of a new microprocessor-based payment system is not trivial, because there are dozens of concepts for implementing this technology that differ greatly in terms of individual surveillance. Some concepts can make electronic payments as anonymous as the use of paper money: for example, the financial institution knows by the end of the day that it should debit such customer’s account by such a total amount, but remains unable to connect this with the various merchants where the customer spent money. At the other end of the spectrum, some concepts provide the financial institution with a wealth of information about who purchased what from whom, precisely when and for how much. The choice between one type of concept and another has little to do with technical or budgetary constraints. It is, in practice, a political decision on the level of surveillance that financial institutions may or may not carry out on the activities of their clients. But it is not elected parliaments that decide. Rather, parliaments have left the decision to private clubs of financial institutions (in Canada, the Canadian Payments Association).
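The two ends of this design spectrum can be sketched as follows. The record layouts and values are hypothetical, not an actual payment-system specification; the contrast is in what the institution gets to keep:

```python
# One hypothetical day of card purchases by one customer.
purchases = [
    ("08:15", "bakery",    4.50),
    ("12:40", "bookstore", 23.00),
    ("18:05", "grocery",   61.25),
]

# Design A (anonymous, cash-like): the institution learns only the daily
# total to debit, nothing about where or when the money was spent.
anonymous_record = {"customer": "C-001", "date": "2011-11-03",
                    "total": sum(amount for _, _, amount in purchases)}

# Design B (fully traceable): who bought what, where, when, for how much.
detailed_record = [{"customer": "C-001", "time": t, "merchant": m, "amount": a}
                   for t, m, a in purchases]

print(anonymous_record["total"])  # -> 88.75
print(len(detailed_record))       # -> 3
```

Both designs settle the same accounts; the difference between them is purely a decision about surveillance, which is the political point of the paragraph above.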

But it is not just the surveillance of individual clients that is at issue. The generalization of electronic payments offers financial institutions a breathtaking real-time view of the economic activities and situations of entire societies. This represents a truly strategic advantage in times of economic turmoil, especially compared to the situation of governments, media and civil society, which discuss measures that will have an impact only several months into the future, on the basis of statistics reflecting situations often four months old. Why should only financial institutions have such up-to-date data?

In individual as in societal surveillance, the democratic principle should apply – in addition to that of freedom – to decisions about social powers of such magnitude. Should parliaments recover the power to legislate publicly on these matters? Should we instead try to democratize the work of clubs such as the Canadian Payments Association? Or follow a different model of democratic decision-making?


The second case is about the internationalization of a public decision: the passports used to monitor citizens’ border crossings, and often their movements within those borders. Design standards for the electronic and biometric components of passports are set in international forums, such as the International Civil Aviation Organization (ICAO), by senior public servants of the member States surrounded by lobbyists from the airline and surveillance-technology industries. National parliaments often have only the choice of endorsing, or not, standards already established elsewhere.

Again, we must find a way to preserve the principle of democracy against such technocratic faits accomplis through international bodies. Should parliaments or governments publicly debate, beforehand, the options to be put forward in international forums? Should we democratize the discussions within international forums so as to give a real voice to the citizens affected by the decisions? Or a combination of both? Or another model?

These are the types of changes, still needing to be outlined, that I propose to explore with you in the discussion that follows.

Communications, Information & Law, Lab Notes, Notes

Lecture on the right to accessible information

by Catherine Roy
Director General of the Centre de recherche et d’expérimentation sur l’inclusion numérique (Centre for research and experimentation in digital inclusion)

« Le droit à l’accessibilité des informations » (the right to accessible information)

(right of access to documents in media readable by everyone)
in the wake of the judgment in Donna Jodhan v. Attorney General of Canada

Lecture in French

A founding member of the HTML for all Working Group of the World Wide Web Consortium (W3C) as well as of W3Québec (an organization promoting open standards and best practices for the web and multimedia), Ms. Roy will address in particular the respective roles of legislation and of technical standards in the evolution of law, here with regard to the accessibility of information.

Monday, April 11, 2011, from 18:00 to 19:30, Room A-1720, UQAM (Hubert-Aquin building, 400 Sainte-Catherine Street East, Berri-UQAM metro), as part of an Information Law course (JUR5512).

Free admission (the number of places being limited, please RSVP by email: peladeau dot pierrot @t uqam dot ca)

Jodhan v. A. G. of Canada
The legal news at the heart of the lecture is the recent Federal Court decision of November 2010, Donna Jodhan v. Attorney General of Canada, as amended by a decision of January 2011. The main issue was whether the federal government had violated the right to equality guaranteed by the Canadian Charter, either by setting inadequate technical standards for Web accessibility of information, or by not implementing existing technical standards.

Information Law

This course acknowledges that much of the legal regulation of interpersonal relations mediated by information handling flows from adhesion contracts and technical standards, as well as from rules and procedures incorporated into informational devices themselves.

Lab Notes, Living between the lines, Notes, Observations, Reflections

Truthfulness of personal information as indicator of social morality?

Can the level of accuracy of personal information items be indicative of the moral virtue of the social system in which the information is used?

This question came to me while doing some renovation at home and listening to the CBC Radio One show Tapestry. This week, Mary Hynes interviewed Sam Harris in the wake of the publication of his book The Moral Landscape: How Science Can Determine Human Values. It was a surprisingly short interview, given that the show’s usual practice is to devote its whole hour to a single personality or subject. Listening to Harris, one understands why. He certainly offers a convincing argument about the ability of science to shed light on a moral issue, or even to decide between right and wrong. However, the fierceness of his attacks on religions quickly grows annoying, which weakens his argument.

Still, neuroscience, for example, can objectively observe, through brain scans and hormonal analysis, that an altruistic action generally provides well-being both to the human beings who perform it and to those who receive it. It also observes exactly the opposite effect with a selfish action, and an even worse one with a malevolent action. Many developments in biology, ethology and ethnology, as well as in psychology and sociology, offer increasingly revealing insights into various moral issues. As Harris points out, science here offers the advantage of transcending cultures, religions and moral systems because of the provable and universal nature of its conclusions.

What about the quality of personal information? The short answer is that, on the one hand, science is dependent on the quality of its data, and this quality often depends on the willingness or ability of human beings to tell the truth. On the other hand, the level of accuracy of the provided information is measurable… scientifically.

The anecdotal answer comes from two recent observations about the necessity… to lie. (more…)

Critique of Census, Living between the lines, Notes, Observations

A Quantitative Methods Professional Answers Us…

This is a response to the previous post from a professional who wrote to me but does not wish to be identified for the moment:

Quantitative methods professional

The idea that a volunteer sample reduces the reliability and validity of data is today as accepted as the idea that the earth is round. […] Many articles deal with the extent of the bias, its causes, the ways one can try to circumvent it somehow, etc. But one can never really succeed in circumventing it.

[As for Justice Boivin’s finding,] I have not read the arguments in favour of a voluntary survey, nor how its proponents think they can avoid the sample biases. Of course there is uncertainty about the reliability of data from the NHS, since this exercise has never been done before. But there is certainty that the data will be biased; what is difficult to predict in advance is the extent and nature of this bias.

Increasing the number of long questionnaires will not change the bias, and nothing leads us to believe that an advertising campaign can correct it. The campaign could very well increase it (especially if conducted only in the two official languages). (more…)

Critique of Census, Debates, Living between the lines, Notes, Observations

Questions for Statisticians and Specialists in Quantitative Methods regarding the Reliability of a Voluntary Census

In the wake of the decision on an application for judicial review from the Fédération des communautés francophones et acadienne du Canada

Federal Court Justice Richard Boivin heard evidence and testimony presented in support of and in opposition to the National Household Survey (NHS), which, being voluntary, replaces the old census long form, mandatory under penalty of fine and even imprisonment. The judge ruled this week that “there is uncertainty about the reliability of the data that will come from the NHS”… except that the Court is “not convinced that the data of the NHS will be so unreliable as to be unusable.”

Let us recall that the Conservative government decided to remove the long form’s mandatory status in the Canadian census and make it voluntary instead. To offset a possible decline in participation, it provided for an increase of around 50% in the number of long questionnaires (from 3 to 4.5 million households, at an additional cost of $30 million), plus an advertising campaign to spur participation.

Many statisticians, demographers and researchers have criticized this decision. According to them, a voluntary survey would lead to a significant decrease in participation, particularly in certain portions of the population (the poorest, the least educated, those of certain ethnic backgrounds). The result would be less representative, and thus biased, data that would distort the demographic profiles of the country, its regions and local communities. However, beyond these general statements, public interventions in the media have so far provided no statistical demonstration in support of this claim. Justice Boivin’s finding seems to confirm this perception.

So I appeal to statisticians and specialists in quantitative methods to clarify certain key elements of the debate. (more…)

Critique of Census, Living between the lines, Notes, Observations

Canadian conservatives battling over the census… in the USA

I wrote in July that the Conservative government’s decision to abolish the compulsory nature of the census long form probably originated in an observation of the recent controversies surrounding the U.S. Census, as well as of the potential political risks and opportunities of importing them into Canada.

Subsequent Conservative statements have amply demonstrated that the rationale for their decision was more one of partisan calculation than of administrative reason or respect for the rights of citizens. Today, the Liberal Opposition tabled a bill to make the long form mandatory again, still with fines, but no longer with imprisonment. As if the opposition in the Commons were blindly following, to the letter, its role in one of the possible scenarios envisioned by the Conservatives.

However, the American inspiration for the strategy and its supporting discourse has never been so clearly brought to light as by the statements of Minister Tony Clement on Tuesday. Jennifer Ditchburn of The Canadian Press reports that, according to Clement, enumerators could beat the system and make off with the personal information of Canadians. Although Statistics Canada has clear policies, “some enumerators are recruited in the same neighbourhood as respondents.” This means, says Clement, that “your neighbour may know some of your most personal and more intimate information.”

The minister was describing the situation in the U.S., where the constitution requires that the census be conducted by enumerators.

In Canada, the census has been self-administered… since 1971. One reason for abandoning enumerators was precisely a matter of respect for privacy: less to avoid the risk of espionage than to reduce the intrusiveness and intimidating presence of a visit from a possible neighbour, and therefore the bias resulting from the reluctance to answer questions honestly, or even to answer at all. Indeed, Statistics Canada’s policy for telephone follow-up reminders is to rely on enumerators who are not from the area of the citizen contacted.

In short, the cat is out of the bag. Clement’s latest arguments are clearly copy-and-paste American imports with no relevance to the Canadian context. So gross an error would not have occurred had the Conservative decision been based on an analysis of what the Canadian census actually needs. Moreover, if one had wanted to improve the census, one would have amended the long form rather than spend the summer denigrating the questions it contains, and even those it does not…

Critique of Census, Living between the lines, Notes, Observations

2011 Census’ Theatre of Fears

Do you know someone who completed the census out of fear of a fine or imprisonment? Or someone who, not having completed it, feared such penalties? No? Then ask: what does the Harper government fear?

The decision to transform the mandatory census long form into a voluntary survey has led to genuine alarm. Scientists, business communities and local administrations dread the deterioration of the data necessary for their work and decisions. Organizations acting for linguistic minorities, women and other communities worry about losing the sound figures on which they base their advocacy.

However, accusations that the Conservatives are trying to undermine the gathering of information that might contradict their policies are not plausible. It would be a dangerous game: skewed results from a botched census could just as easily disserve them. Nor does it fit with a 50% increase in the number of long questionnaires (from 3 to 4.5 million, at an additional cost of $30 million), plus a campaign promoting participation. Moreover, this government’s punctilious program reviews and, especially, this Conservative Party’s wedge-politics strategies require very reliable statistical benchmarks. (more…)
