« The Open Data Room of the Banque de France »

Christian Pfister has been Deputy Director General of the Banque de France's Directorate General Statistics since January 2013, following his previous appointment as Deputy Director General of Economics and International Relations, and has been the President of the French Bond Association since October 2014. Christian also lectures at Sciences Po, teaching a course in Financial Stability with Françoise Drumetz, and his work has been published in the fields of monetary policy, monetary unification, financial stability, savings and financing, international monetary economics, statistics, the labour market and structural reforms. He is a graduate of HEC and Sciences Po.

1 - Could you briefly tell us about the DGS and the databases it has at its disposal?

The DGS collects, analyses and disseminates statistics on a broad range of subjects.

Within the framework of international or European regulations, it compiles monetary and financial statistics, the balance of payments and the international investment position. To this end, the DGS collects data on deposits, credit, interest rates, and issuance and holdings of securities from financial institutions, particularly banks, as well as information from non-financial companies. All of these data are made available to the public in the form of regular, concise analyses referred to as "Stat Info" or in the form of reports. They are also accessible as aggregated series that can be downloaded from the Banque de France's web portal, Webstat.

The DGS uses these same channels to disseminate data collected within a more strictly national framework by the Banque de France's other business areas, such as the Central Credit Register's data on business lending and the results of economic surveys conducted by the network, which are notably used in the preparation of the GDP forecast.

 

2 - What is the Open Data Room and who is involved in this project?

The Banque de France grants free, effective access to its confidential data through a secure Open Data Room, once the data have been de-identified and, where necessary, stripped of other information. Prior approval must be obtained from the Data Access Committee (Comité d’examen des demandes d’accès aux données de la Banque de France). This committee is composed of Banque de France representatives and experts from academia, who are responsible for ensuring that the request relates to a scientific research project for publication purposes only. As at end-2017, almost 600 million data series were accessible to researchers.

The Open Data Room is managed by the DGS in partnership with the various Banque de France business areas that supply the data that are made available.

 

3 - Where does the Open Data Room fit in with the DGS's and the Bank's other data dissemination channels?

The Banque de France disseminates 30,000 aggregated series on its website via the Webstat portal. It also provides statistical series to European and international organisations such as the ECB, the IMF, the OECD and Eurostat.

The aim of the Open Data Room is to give external researchers access to confidential data in an anonymised format, as required by the regulations that govern their collection.

 

4 - Who is the Bank trying to target by opening up access to these data via the ODR?

The Open Data Room, or "ODR", is for public or private-sector research related to non-commercial projects carried out for scientific purposes by non-Banque de France research teams or mixed teams of external researchers working with Banque de France analysts.

 

5 - What data are available to researchers? How far back do they go, and what is their periodicity?

The individual data that are made available are extremely wide-ranging in terms of the economic sectors that they cover and are extremely detailed with regard to transactions. Researchers can thus consult the balance sheets and income statements of credit institutions, and interest rates for their deposits and loans. Daily interbank transactions of French banks become available one year after they are carried out. Securities data concern the issuance of debt securities and holdings, on a security-by-security basis, for each resident institutional sector. With regard to non-financial companies, information such as tax returns, individual bank loans and default and failure statistics is available, as are overindebtedness data for households. Cross-border transaction data cover direct investment and portfolio investment flows, movements in bank loans and deposits, and the economic transaction flows of non-financial companies.

As for when the records begin and their time intervals, it depends on the database. Generally, all the returns available that are of sufficient quality are provided. Certain databases can become so large that manipulating the data becomes difficult. When this is the case, for the researchers to obtain optimal results, they have to select variables that match their needs as closely as possible.

 

What procedures must be followed to get access to the data?

Researchers must complete the form that can be downloaded at https://www.banque-france.fr/statistiques/acces-aux-donnees-granulaires and send it to the following e-mail address: DGS-DIMOS-acces-donnees-ut@banque-france.fr. Each application for access to the data is evaluated by the Data Access Secretariat and methodological specialists. The researchers are contacted in order to ascertain precisely what type of data is required in light of their research objectives.

The access to data and the conditions of use are subject to the prior agreement of the Data Access Committee (see point 2).

 

6 - Does the DGS or the Banque de France vet the research subject or subjects when granting access to ODR data?

The Data Access Committee does not judge the merits of the project and does not vet the study topics. However, it can assess the relevance of the Banque de France's data to the project's objectives. After all, making the data available involves the mobilisation of significant resources by the Banque de France, including the teams and the specialists required to process the request and format the data. It also requires state-of-the-art IT resources. Consequently, it is important that these resources are used effectively.

The Data Access Committee also pays particular attention to ensuring that it is impossible to attribute any sensitive data to an identifiable economic agent, even indirectly. This legal risk is doubly covered by the anonymisation of data and the requirement that each member of the research group sign a confidentiality agreement.

 

7 - Does the DGS or the Banque de France reserve the right to vet the research results? In other words, is there a quid pro quo, such as priority of access to the research results, for the data being made freely available?

Again, the Bank cannot vet the content or relevance of the results. It simply ensures that the confidentiality of the data provided is respected and that the resources mobilised by the Bank are put to good use.

Before the results are passed on to the researcher, they are therefore formally checked by the Secretariat and statistical specialists to ensure that no individual entity is identifiable in the output.

In order to verify that the costs incurred by the Bank result in a real benefit for the community, we recently started asking researchers to send the Banque de France the papers they have written on the basis of the Bank's data once these papers are presented at a seminar or submitted to journals.

 

8 - Is it possible to match other sources of data with those of the DGS in the ODR?

If requested by the researcher, data related to the same entity and available in several Banque de France databases can be anonymised using the same anonymisation key, so that records can still be matched across databases. The same procedure can be applied between the Bank's databases and researchers' external databases, provided the researchers have the right to import them into the ODR and that the data do not become indirectly identifiable as a result of the cross-referencing.
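To illustrate why a shared anonymisation key makes matching possible, here is a minimal sketch in Python. It is not the Banque de France's actual procedure, which is not described here: the keyed hash (HMAC-SHA256), the function names, the field names and the sample records are all assumptions chosen for illustration. The idea is simply that the same identifier, processed with the same secret key, always yields the same pseudonym, so two anonymised databases can be joined without the original identifier ever being exposed.

```python
import hashlib
import hmac


def pseudonymise(entity_id: str, key: bytes) -> str:
    """Return a keyed pseudonym for an identifier (assumed scheme: HMAC-SHA256)."""
    return hmac.new(key, entity_id.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymise_rows(rows, id_field, key):
    """Drop the original identifier from each row and add its pseudonym instead."""
    out = []
    for row in rows:
        clean = {k: v for k, v in row.items() if k != id_field}
        clean["pseudo_id"] = pseudonymise(row[id_field], key)
        out.append(clean)
    return out


def match_on_pseudonym(left, right):
    """Join two anonymised datasets on the shared pseudonym (inner join)."""
    index = {row["pseudo_id"]: row for row in right}
    return [
        {**l_row, **index[l_row["pseudo_id"]]}
        for l_row in left
        if l_row["pseudo_id"] in index
    ]


# Hypothetical key and records, for illustration only.
SECRET_KEY = b"held-by-the-data-provider-only"
loans = [
    {"firm_id": "FR001", "loan_amount": 250_000},
    {"firm_id": "FR002", "loan_amount": 90_000},
]
balance_sheets = [
    {"firm_id": "FR001", "total_assets": 1_200_000},
    {"firm_id": "FR003", "total_assets": 400_000},
]

# Both datasets are anonymised with the SAME key, so pseudonyms line up.
anon_loans = anonymise_rows(loans, "firm_id", SECRET_KEY)
anon_sheets = anonymise_rows(balance_sheets, "firm_id", SECRET_KEY)
matched = match_on_pseudonym(anon_loans, anon_sheets)
```

In this sketch only the firm present in both datasets survives the join, and the matched record carries a pseudonym rather than the original identifier. Had the two datasets been anonymised with different keys, no record would match.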

 

9 - What tools are available to researchers who have been granted access to the ODR?

Three workstations are available on site, each equipped with Anaconda3, MATLAB, Demetra, Notepad++, R, RStudio, SAS and Stata 14 software as standard.

 

10 - What are the practical aspects of accessing the data?

The Open Data Room is located in the premises of the Banque de France – Directorate General Statistics, at 37 rue du Louvre, 75002 Paris, and is open from 10 a.m. to 6 p.m., Monday to Friday. The first time the research group comes to the premises, the Data Access Secretariat walks the researchers through the use of the technical tools and databases and ensures, in particular, that they have access to the associated methodologies. Throughout their work, the Data Access Secretariat is available to provide assistance to the researchers, who can also contact the statistical specialists responsible for the databases.

 

11 - Have procedures for updating the data already been envisaged?

For databases to be updated, a request for a project extension must be submitted to and approved by the Data Access Committee. The same applies to requests for additional data and changes to the research team.

 

12 - Without mentioning names, can you tell us about the entities that already benefit from the Banque de France's offer to provide access to sets of its data?

On the whole, they are universities, prestigious educational establishments and public institutions.

 

13 - Could the databases also be opened up to foreign researchers?

Yes, we have already granted access to data to foreign research groups, and we are even considering setting up a Banque de France data access site at its premises in New York in 2018. Finally, the Banque de France is involved in INEXDA, an international project for exchanging experience on statistical handling of granular data for research purposes. Participation in INEXDA is open to all central banks, national statistics offices and international organisations.

Updated on: 05/01/2018 11:38