Capitalmind

Making Data Open For India

Nikhil Pahwa makes the case for Open Data In India.

Earlier today, we received an email from a Government of India organization (we’ve been advised not to name them), asking us to remove certain posts that we had published on the basis of data published online by them. We’ve been publishing our analysis of publicly available Government data relevant to the Internet and Telecom domains since our inception almost three years ago, and are now proactively seeking more data from the Government for our data journalism initiative, MediaNama Charts. Our take on this unexpected take-down request:

– We do not believe that raw data published online by Government organizations, for citizens to peruse, is or should be under copyright.
– Our act of analysis and reportage on the basis of that data is not a case of copyright violation, and any move to prevent us from publishing the data or asking us to remove it impinges on our freedom to report on developments, as a media organization.
– We could have acquired the same data by filing Right To Information Act requests, and published the data, so why ask us to remove it?

I would be horrified if the Indian Government actually tried to defend a copyright that it may or may not have (since that data is available to the public through RTI anyhow). I use government-produced data often, and it would be a shame if they refused to let us use the data while at the same time publishing it for public consumption. They could try to go to court, and that is a deterrent in that many of us don’t have the money or time to fight a court case. But if it comes down to that, I’m happy to share in the costs and time, regardless of who is attacked.

What Nikhil and his team do is analysis on top of the data itself. The graphs, charts, tables and so on are derivative works by definition, and once you have data out there you can’t copyright the way it is used. As far as I know, you can only defend access to it as a trade secret, and by its nature, any data revealed on a public government website is not a trade secret.

The other aspect is to clean up such information and provide it in a different format. This is where the lines get murky, because if you own the copyright to something, no one can simply change the delivery mechanism. So I can’t, for instance, take a book, scan it and say that because I am delivering it digitally, I am adding value or creating a derivative work. But if you took some data – say TRAI data from PDFs – and put it in a downloadable XML or JSON format, would you be adding value? I think you would, because you now allow so much more to happen with that data, from using it in Excel to doing different kinds of visualizations. But there may need to be a court precedent for this – if you know of any, please do let me know.
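To make the format point concrete, here is a minimal Python sketch of what "putting it in a downloadable format" could look like once a table has been copied out of a PDF. It only uses the standard library. The column names, operator names and figures are placeholders invented for illustration, not real TRAI data, and the function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical rows, as they might look after being copied out of a PDF table.
# The column names and figures are placeholders, not real TRAI data.
RAW_ROWS = [
    ["Operator", "Subscribers (millions)"],
    ["Operator A", "10.5"],
    ["Operator B", "7.25"],
]


def rows_to_records(rows):
    """Turn a header row plus data rows into a list of dicts."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def to_json(rows):
    """Serialize the table as a JSON array of objects."""
    return json.dumps(rows_to_records(rows), indent=2)


def to_csv(rows):
    """Serialize the table as CSV text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(RAW_ROWS))
    print(to_csv(RAW_ROWS))
```

The numbers themselves are unchanged; what changes is that a spreadsheet or a charting tool can now read them directly, which is the kind of added value I have in mind.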

The final defence is the freedom of the Press, which might be the winning factor here. But that would still be a shame, because there is no reason non-media entities like NGOs, visualization specialists or even consultants cannot use that data. This data is paid for by the taxpayers of India, and therefore belongs collectively to all of us. Where it is revealed, anyone should be able to use that data, whether as numbers, charts or anything else. If RTI was a big step, let’s take the smaller step of making our government data open.

  • Anonymous says:

    >I have the opposite viewpoint – the RTI was well intentioned but the wrong way to achieve transparency. All public data should have been open to start with (except defense, sensitive stuff etc, some of which can be released after a few years/decades). RTI would then have been unnecessary. We should have had a Right to Open and transparent government instead of RTI (which puts the onus on citizens to request information). Instead the onus should have been on government to provide transparency and open data.

  • Anonymous says:

    >It would also be interesting to see what the expectancy is for the next week, month, etc. Next day return is too specific and conclusions might not be as reliable as compared to expectancy for longer periods.