
Reading Notes : Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction

About: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction (Information and Privacy Commissioner of Ontario, Canada, 2013).

Cover of: Cavoukian, Ann. A Primer on Metadata: Separating Fact from Fiction (Information and Privacy Commissioner of Ontario, Canada, 2013)

Tweets from Privacy by Design (@embedprivacy) signaled the publication of A Primer on Metadata: Separating Fact from Fiction (an 18-page PDF document). As I am currently working on a related subject, I read it at once… and was disappointed. The actual primer on what metadata is runs only two pages, and it is rather minimal, inaccurate and not quite convincing.

Metadata (formal definition):

“Metadata is (…) essentially information about other information, in this case, relating to our communications.”

In this case : “Metadata is information generated by our communications devices and our communications service providers, as we use technologies like landline telephones, mobile phones, desktop computers, laptops, tablets or other computing devices.”

Cavoukian, 2013, p. 3

Metadata (descriptive definition) : “Metadata includes information that reveals the time and duration of a communication, the particular devices, addresses, or numbers contacted, which kinds of communications services we use, and at what geolocations. And since virtually every device we use has a unique identifying number, our communications and Internet activities may be linked and traced with relative ease – ultimately back to the individuals involved.”

Cavoukian, 2013, p. 3


As presented in the document, these two definitions are at odds with one another: the formal one refers to information items about other information items, whereas the descriptive one refers to information about processes. Computer specialists do recognize many kinds of metadata, though, even if they use different typologies.

The few lines entitled “A Day in the Life…” (pp. 3-4) provide a good illustration of how (process) “metadata created by the devices that two individuals use to communicate with each other can reveal a great deal” about them.
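To make this point concrete, here is a minimal sketch, not taken from the primer, of how records containing no message content at all can be linked through a device identifier. Every record, field name and value below is invented for illustration; real call records are structured differently.

```python
from collections import Counter, defaultdict

# Hypothetical communication records: no content, only metadata.
# All identifiers, numbers and tower names are invented for this sketch.
RECORDS = [
    {"device_id": "dev-001", "contacted": "555-0100", "duration_s": 120, "cell": "tower-12"},
    {"device_id": "dev-001", "contacted": "555-0100", "duration_s": 300, "cell": "tower-12"},
    {"device_id": "dev-001", "contacted": "555-0199", "duration_s": 45, "cell": "tower-31"},
    {"device_id": "dev-002", "contacted": "555-0100", "duration_s": 60, "cell": "tower-07"},
]


def profile_by_device(records):
    """Group records by device identifier and summarize each device's activity."""
    profiles = defaultdict(lambda: {"contacts": Counter(), "cells": set(), "total_s": 0})
    for record in records:
        profile = profiles[record["device_id"]]
        profile["contacts"][record["contacted"]] += 1   # who is contacted, and how often
        profile["cells"].add(record["cell"])            # where the device was seen
        profile["total_s"] += record["duration_s"]      # how long it was in use
    return dict(profiles)


profiles = profile_by_device(RECORDS)
```

Calling `profiles["dev-001"]["contacts"].most_common(1)` gives the number this device contacts most often, and `profiles["dev-001"]["cells"]` gives the places where it was seen. Both were obtained without reading a single message.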

Finally, the section “Metadata May Be More Revealing Than Content” (pp. 4-5) reads more like a series of arguments from authority than as an actual demonstration.

Need for evidenced arguments

Coincidentally, while answering engineering students during a lecture I gave at Polytechnique Montréal last week, I had to remind them that an information set is metadata not by some intrinsic nature, but merely by the context of its initial production and use. Classically, the term data refers to information items that are available (or to be produced) for the solution of a problem or the completion of a reasoning, an inquiry or a piece of research. As soon as one uses “metadata” (whatever the type) in this way, they become “data”. They are thus no longer “metadata”.

Ever since the very first universal computing machine, computers – and digital devices after them – have required metadata to work. They also produce other metadata as by-products of their processes. And from the dawn of informatics, those metadata were at once reused as data.

There is nothing new about using metadata to produce knowledge about people. A classic example is the introduction of computerized cash registers. As the machine processes the customers’ purchases, it produces clock metadata that can be used to assess the clerks’ speed in punching in (now scanning) items, taking payments and giving change, packing the goods and moving on to the next customers.
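The cash register case can be sketched in a few lines. This is only an illustration under stated assumptions: the transaction log, its field names and its values are all hypothetical, and a real register would record far more than this.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical register log: the timestamps exist to process sales,
# not to evaluate anyone. All values are invented for this sketch.
TRANSACTIONS = [
    {"clerk": "A", "items": 10, "start": "2013-07-22T10:00:00", "end": "2013-07-22T10:01:40"},
    {"clerk": "A", "items": 5, "start": "2013-07-22T10:02:00", "end": "2013-07-22T10:02:50"},
    {"clerk": "B", "items": 8, "start": "2013-07-22T10:00:00", "end": "2013-07-22T10:02:40"},
]


def seconds_per_item(transactions):
    """Reuse the register's clock metadata as data about each clerk's speed."""
    totals = defaultdict(lambda: [0.0, 0])  # clerk -> [elapsed seconds, items handled]
    for t in transactions:
        start = datetime.fromisoformat(t["start"])
        end = datetime.fromisoformat(t["end"])
        totals[t["clerk"]][0] += (end - start).total_seconds()
        totals[t["clerk"]][1] += t["items"]
    return {clerk: seconds / items for clerk, (seconds, items) in totals.items()}


speeds = seconds_per_item(TRANSACTIONS)
```

With these invented figures, clerk A averages 10 seconds per item and clerk B 20. The same timestamps that were metadata for the sale have become data about the employee.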

Anytime an operation is linkable to a human user, the operation’s metadata can be exploited as data about this human user (and anyone related to that person). Video games provide good examples of how the same outputs can simultaneously be the processes’ metadata and the players’ data.

This relative artificiality and mutability of the distinction between data and metadata becomes obvious when one considers (as these tweet structure maps show) that making a tweet of at most 140 characters can easily require the production of between 500 and 1,000 characters of metadata, which include… the tweet message itself!
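The proportion can be checked on a toy example. The status object below is a drastically simplified, hypothetical stand-in for what a service might produce around a message. Its field names and values are assumptions for illustration, not the actual Twitter structure, which contains many more fields.

```python
import json

# Simplified, hypothetical status object. Only "text" was written by the user;
# every other field is produced around the message. All values are invented.
status = {
    "id": 1234567890,
    "created_at": "Mon Jul 22 10:00:00 +0000 2013",
    "text": "Reading a primer on metadata.",
    "source": "web",
    "lang": "en",
    "coordinates": None,
    "retweet_count": 0,
    "user": {
        "id": 42,
        "screen_name": "example_user",
        "followers_count": 10,
        "time_zone": "Eastern Time (US & Canada)",
    },
}


def message_share(obj):
    """Return (length of the message, length of the whole serialized object)."""
    serialized = json.dumps(obj)
    return len(obj["text"]), len(serialized)


message_length, total_length = message_share(status)
print(f"message: {message_length} characters out of {total_length} produced")
```

Even in this stripped-down form, the message is a small fraction of the object, and it sits inside it as one field among others.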

And indeed, the relative weights of “metadata” and “data” in today’s particular instances can often be startling… if one can still distinguish between the two.

Also, need to make evidence evident

How come there is no readily available button on which I could click to see the whole tweet actually produced, not only the message I wrote and sent?

Or how come there is no readily available command to display what information my mobile phone service actually produces minute by minute?

And as I pointed out to Polytechnique’s engineering students: if the NSA’s work is essentially done with computerized devices, how come Congress does not have a dashboard that harnesses the metadata about what kinds of operations the NSA actually carries out? Had such metadata been available, could Director James Clapper have lied so easily about the NSA’s operations before Congress? And would Congress have discovered it only through documents leaked by a whistleblower? After all, would it not be only metadata about systems’ uses, not data from the individual intelligence operations themselves? 😉

Such questions are of critical and practical political significance, because they breed other questions: about who decides on the production of such information, about its uses, about who controls them, about their consequences, and so on. They are of critical and practical significance also because they could turn a defensive stance into one of political affirmation. Such questions stem from an understanding of the nature of information and information processing. This is why it is so important to deepen and strengthen this understanding, as well as to popularize it and make it usable by all citizens.

So if you know any instructive work on the subject…


A Few Notes from Luciano Floridi’s book The philosophy of information

From : Floridi, Luciano. The philosophy of information (Oxford [England]; New York: Oxford University Press, 2011).


Good problem / Open problem / Research Method

« Good problems are the driving force of any intellectual pursuit. Being able to do valuable research hugely depends on having good taste in choosing them. Now, for Hilbert, a good problem is a problem rich in consequences, clearly defined, easy to understand and difficult to solve, but still accessible. Again, it is worth learning the lesson, with a further qualification. We saw in chapter one that genuine philosophical problems should also be intrinsically open, that is, they should allow for genuine, reasonable, informed differences of opinion. Open problems call for explicit solutions, (p. 29 ) which facilitate a critical approach and hence empower the interlocutor. »

Floridi (2011) p. 28-29


Good problem / Research Method

« Hilbert thought that mathematical research has a historical nature and that mathematical problems often have their initial roots in historical circumstances, in the ‘ever-recurring interplay between thought and experience’. Philosophical problems are no exception. Like mathematical problems, they are not contingent but timely. »

Floridi (2011) p. 28


Solution / Explicitness / Rigor / Research Method

« The more explicit and rigorous a solution is, the more easily it is criticizable. Logic is only apparently brusque. Its advice is as blunt as that of a good friend. The real trap is the false friendliness of sloppy thinking and obscure oracles. Their alluring rhetoric undermines the very possibility of disagreement, lulling the readers’ reason to sleep. »

Floridi (2011) p. 28


Information / Epistemology

« The informational circle: How can information be assessed? If information cannot be transcended but can only be checked against further information—if it is information all the way up and all the way down—what does this tell us about our knowledge of the world?

The informational circle is reminiscent of the hermeneutical circle. It underpins the modern debate on the foundation of epistemology and the acceptability of some form of realism in the philosophy of science, according to which our information about the world captures something of the way the world is (Floridi (1996)). »

Floridi (2011) p. 40


Information / Epistemology / Model / Information Modelling / Research Method

« The semantic view of science: Is science reducible to information modelling?

The semantic approach to scientific theories (…), argues that

scientific reasoning is to a large extent model-based reasoning. It is models almost all the way up and models almost all the way down. (Giere (1999), p. 56).

Theories do not make contact with phenomena directly, but rather higher models are brought into contact with other, lower models (see chapter nine). These are themselves theoretical conceptualizations of empirical systems, which constitute an object being modelled as an object of scientific research. Giere (1988) takes most scientific models of interest to be non-linguistic abstract objects. Models, however, are the medium, not the message. Is information the (possibly non-linguistic) content of these models? How are informational models (semantically, cognitively, and instrumentally) related to the conceptualizations that constitute their empirical references? »

Floridi (2011) p. 41


Data / Information / Materialism

« Wiener’s problem: What is the ontological status of information?

Most people agree that there is no information without (data) representation. This principle is often interpreted materialistically, as advocating the impossibility of physically disembodied information, through the equation ‘representation = physical implementation’. (…) Here, let me stress that the problem is whether the informational might be an independent ontological category, different from the physical/material and the mental, assuming one could draw this Cartesian distinction. Wiener, for example, thought that

Information is information, not matter or energy. No materialism which does not admit this can survive at the present day. (Wiener (1948), p. 132)

If the informational is not an independent ontological category, to which category is it reducible? If it is an independent ontological category, how is it related to the physical/material and to the mental? »

Floridi (2011) p. 42


Abandoning the concept (and illustration) of “information collection” for that of “production”

In its original 1990 version, the theory of interpersonal information processes refers to collection as one of information’s logical phases. The term collection is borrowed from the law on the protection of personal information, which itself borrowed it from the lexicon of public and private bureaucracies. However, the word collection (the act of picking up a pre-existing object) masks the production of new informational artifacts. As a result, several implications are veiled, particularly those related to the intellectual property of the new information objects and to their pragmatic dimension.

The question then is: should collection really be considered as a logical phase of information? Or is it the chosen term that is inadequate?


2012 Map of a Twitter Status Object for Dummies

Provisional book cover: Title :

This post is about the “Beyond Privacy” Project: LIVING BETWEEN THE LINES: information society through our personal information.

As this is an open work-in-progress book drafting project,

please do not hesitate to comment!

Every input is precious to help improve it.

Many have probably seen the Map of a Twitter status object below. Produced by Raffi Krikorian, from Twitter’s engineering department, this one-page chart quickly became popular, because it illustrated in a single image that a Twitter message was not a mere line of text of up to 140 characters.

Although this document and its annotations were addressed primarily to API developers, they had strong educational value. I have used the chart often. You had to see how wide the eyes of information law students opened in surprise and curiosity! That chart made it easy to pass on the message that we must do our homework when assessing informational practices. That we must not be satisfied with only the visible information items and processes. That we must understand what actually happens in the black box. Even ask computer technologists for a hand.

I was writing a new book chapter entitled “Production Inputs”. It explains that handling information objects allows us to produce new ones. However, this task requires, often without our realizing it, the production of even further information objects, either to carry it out or to describe it. The example of the 140-character tweet which, in fact, features thousands of characters of code seems a great illustration of this point.

So I undertook to produce a new chart that would be updated, clearer, and more easily readable and understandable by non-specialists.

Partial List of Information Items Linked to a Tweet (small)

The result is this chart, spread over two pages. It would have taken three to be exhaustive. Please click the following to access:

Among many things, this exercise revealed to me the existence of fields for blocking messages or entire users’ accounts at the request of public authorities, of copyright holders, or of others. It also revealed that this map is not only that of a tweet, but also of all the information items coproduced with it. To the extent that all these items are available in practice, the distinction is perhaps only one of nuance. From a pedagogical point of view, however, it is worth mentioning.

A further revelation: I also found a few typos in the syntax, descriptions and field statuses of Krikorian’s original chart. As I am far from being a Twitter engineer, I would be very grateful if you would point out to me any typo or error in the new chart proposed here.


“Beyond Privacy” Project: The Mandatory Multiplication of Electromagnetic Information Loaves


Provisional book cover: Title : "Living Between The Lines: Information Society Through Our Personal Information" Mentions: "Beyond Privacy Project : An open work-in-progress"



Utility vehicles


Information objects allow us to interact across time and space. This capability varies depending on the physical support. The difference becomes obvious between solid matter and electromagnetic waves.

How would it feel for you to pull out a banknote and burn it?

If you feel a twinge of loss, its source is not the combustion of a fraction of a gram of matter. If you feel a pleasurable excitement, it does not come so much from the momentary flame.

The emotion comes mainly from the irreversible loss of information items. And not just any! The writing that went up in smoke conveyed a unit of value that we could share with others.

The destroyed information allowed us to get a good or a service from other persons. Or to repay them a debt. Or to lend them assistance. Or to offer them a gift.

Also vanished is the ability to offer ourselves a gift.

Hence the emotion produced. We have forever destroyed information items representing a fragment of power in the human world.

Cover of the Voyager probes' golden disk: a circular plate on which are engraved the instructions to play the disk and a map of the location of the solar system.



“Beyond Privacy” Project: Glossary




Information : The word receives quite different definitions depending on the uses and the fields of practice. In this book, the word designates a material support for the conservation, communication and processing of knowledge or signals, particularly those that give a form to an interpersonal relation.


“Beyond Privacy” Project: Chapter on the Material Reality of Information


Chapter from Part One: Alignment: Objects Called “Information”

Material Strength


Digital information items are objects whose handling we entrust to machines. Often microscopic, such information objects and their handling can then become invisible to us.

Many have claimed that we are witnessing a dematerialization of human activities.

Dematerialization of the economy? It is true that increasing shares of production and commerce consume less matter and energy. One share consists of “intellectual” services: marketing, research and development, consulting, training. Another share deals with digital products which may be transported electronically.

Dematerialization of money? Of finance? Or of information in general? Also true. Everywhere, paper is being replaced by powerful electronic media.

Unfortunately, many thought that this was a literal dematerialization: a complete disappearance of matter. Such dematerialization would imply that information items are immaterial entities. The huge Internet infrastructure would be a sort of intangible cloud. Some cyberspace would be developing in some parallel universe whose properties fall outside those of the physical world. State legislation would be practically unenforceable there. Information flows would be insensitive to national borders. Any ambition to control these flows would prove illusory.


“Beyond Privacy” Project: Chapter Defining “Information” by Using a Slinky


Chapter from Part One: Alignment: Objects Called “Information”

High Definition


The word “information” is part of our everyday language. But it means too many different things. A careful exploration demands that we first settle on a common definition.

Falling Slinky: Experiment showing that the bottom end of a Slinky in free fall will float until the coils above reach it

Literally, to inform means “to give a form” to something. This was the sense of its 2,000-year-old Latin ancestor, informare. It was also used to mean “to get an idea of” something or someone.
