Article Open Access

Fostering Excellent Research by the Austrian Micro Data Center (AMDC)

  • Regina Fuchs, Tobias Göllner, Simon Hartmann and Tobias Thomas
Published/Copyright: July 4, 2023

Abstract

Access to high quality microdata is a precondition for the empirical investigation of many interrelationships in the economic and social sciences. Therefore, well-functioning research data infrastructure is a cornerstone of a successful science location. While other countries in Europe, such as Denmark and the Netherlands, have had microdata centres at their respective National Statistical Offices for quite some time, microdata access for research purposes in Austria was very limited for a long time. Established in 2022, the Austrian Micro Data Center (AMDC) at Statistics Austria enables researchers of accredited research institutions to work with pseudonymized microdata on individuals and firms. The available microdata includes not just microdata of Statistics Austria but also registry data of the Austrian federal government. The main novelty is that microdata can be linked deterministically to each other via unique pseudonymized identifiers among data sets of Statistics Austria, administrative registers, and also to microdata brought in by the researchers themselves. The AMDC is operated by Statistics Austria and its services are open to research institutions worldwide.

JEL Classification: C0; D0

1 Introduction

Internationally, National Statistical Offices have played a major role in establishing access to administrative microdata for research purposes.[1] Examples include Statistics Denmark,[2] Statistics Finland,[3] and the Centraal Bureau voor de Statistiek (CBS) in the Netherlands.[4]

In Austria, microdata access for research purposes was very limited for a long time. However, in 2022, a change in legislation enabled access to microdata for research entities comparable to the European forerunner countries. Before this change, Austria’s National Statistical Office, “Statistics Austria”,[5] was not authorized to provide access to data on the level of individuals or firms to researchers. Consequently, it was often not possible to test important research hypotheses empirically or to draw causal inferences using Austrian data. Thus, Austria was at a disadvantage as a science location and it was difficult for research entities to provide evidence-based scientific advice to Austrian policy makers in many fields. In contrast, more than 1500 scientific projects and publications were published from 2006 to 2022 using Dutch microdata provided by the Centraal Bureau voor de Statistiek (CBS).[6]

In 2022, an amendment to the Federal Statistics Act (German: Bundesstatistikgesetz, BStatG, see Section 3) established the legal basis for remote access to indirectly identifiable microdata for scientific purposes. With the amendment, Statistics Austria established a new research data infrastructure, the Austrian Micro Data Center (AMDC), which opened on July 1, 2022.

The aim of the Austrian Micro Data Center (AMDC) is to provide central, data protection-compliant remote access to statistical registers, as well as to other microdata of federal governmental entities, for empirical research. In this sense, a one-stop shop for scientific purposes has been created and, by doing so, an essential research data infrastructure established.[7] According to the Federal Statistics Act (BStatG §§ 31, 32), the basic infrastructure of the AMDC is financed by the Austrian Federal Ministry of Education, Science and Research (BMBWF), whereas the variable costs must be borne by the research entities. This highlights the importance of competitive research calls for registry research.

The establishment of the AMDC closely follows existing best-practice models, in particular microdata access at the Centraal Bureau voor de Statistiek (CBS) and Statistics Denmark. This paper describes the application process and project work with the AMDC (Section 2), provides an overview of the microdata accessible in the AMDC (Section 3), and gives an outlook on possible further developments (Section 4).

2 Application Process and Project Work

2.1 Accreditation of a Research Entity

To meet the legal requirements and to gain remote access to microdata for scientific purposes, the applying research entity must meet several requirements. These requirements are defined in the Federal Statistics Act (BStatG § 31 (7)) and include:

  1. Conduct research at university level and make the results available to the public free of charge.

  2. Be an organisation with legal personality, with a primary focus on research.

  3. Be independent and autonomous in scientific activity and in formulating scientific conclusions.

  4. Fulfil the technical and infrastructural requirements with regard to guaranteeing data security.

Evidence that these requirements are met must be submitted during the official registration of a research organization with the AMDC. After reviewing the documents, the AMDC issues the research organization an official confirmation of accreditation. On top of these institutional requirements, the members of these organizations must commit themselves to the strict data protection measures of the AMDC, which include no re-identification of individuals or firms and only analysing microdata for research purposes.

The Federal Statistics Act (BStatG § 31 (8)) lists a number of scientific institutions that meet the first three requirements (BStatG § 31 (7)). Within the first 12 months, more than 40 national and international research organizations successfully applied for accreditation with the AMDC, including Vrije Universiteit Amsterdam, University of Gothenburg, and almost all Austrian universities as well as national and international research institutes like the Austrian Institute of Economic Research (WIFO), Complexity Science Hub Vienna (CSH), Geneva Graduate Institute, ifo Institute – Leibniz Institute for Economic Research Munich, Institute for Advanced Studies Vienna (IHS), and the Halle Institute for Economic Research (IWH).[8]

Accreditation is typically valid for five years. Only in case of expiration after five years or if there are significant changes, for instance in the legal structure or the main activities of an organization, does the entity have to re-apply for accreditation.

2.2 Application for a Research Project

The application for a research project is open to any employee of a research organization accredited by the AMDC.[9] Similar to the accreditation process, the project proposal to gain access to the AMDC is submitted via the AMDC online application.[10]

For legal reasons, the application for a research project must include a research proposal and a justification to access the requested data. In more detail, an AMDC project has to include:

  1. a title and a brief description of the aim, research question and/or main hypotheses;

  2. analytical methods and expected results;

  3. a detailed justification of the selection of data sets and variables in reference to the aims, questions and hypotheses;[11]

  4. information on the researchers who want to access and work with the data (e.g. proof of employment with the accredited scientific institution); and

  5. the timeframe of the research project.

The AMDC reviews the proposal, checking both data protection concerns and the project feasibility. In this step, the AMDC not only checks all legal requirements but also ensures that the selected data sets and variables are compatible with respect to sample overlap (over statistical units and time) and external identifiers.
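To make the notion of sample overlap concrete, the following minimal Python sketch checks which reference years are shared by two requested data sets and whether they rely on the same type of identifier. It is an illustration only and not the procedure applied by the AMDC: the class `DataSetInfo`, the functions `common_years` and `share_identifier`, and the identifier labels are hypothetical; only the year ranges are taken from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetInfo:
    """Hypothetical catalogue entry: name, identifier type and covered years."""
    name: str
    id_type: str            # e.g. "person" or "enterprise" (assumed labels)
    years: frozenset


def common_years(datasets):
    """Return the sorted reference years covered by every requested data set."""
    if not datasets:
        return []
    shared = set(datasets[0].years)
    for ds in datasets[1:]:
        shared &= ds.years
    return sorted(shared)


def share_identifier(datasets):
    """Return True if all requested data sets use the same identifier type."""
    return len({ds.id_type for ds in datasets}) <= 1


# Year ranges as listed in Table 1; the identifier types are assumptions.
income_tax = DataSetInfo("Income tax statistics", "person",
                         frozenset(range(2000, 2020)))
education = DataSetInfo("Educational attainment", "person",
                        frozenset(range(2015, 2021)))

print(common_years([income_tax, education]))      # [2015, 2016, 2017, 2018, 2019]
print(share_identifier([income_tax, education]))  # True
```

In this example, a project combining both data sets could only analyse the years 2015 to 2019 jointly, which is the kind of restriction a feasibility check would flag before an offer is made.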

After reviewing the whole application, the AMDC provides feedback to the researchers.[12] At this stage, revisions to the proposal are possible. Once the research proposal and the researchers fulfil all formal requirements and the AMDC has approved the final research proposal, the AMDC provides a formal offer for data access, which includes detailed costs to be borne by the research entity.[13] If the research entity accepts the offer, the contract is concluded. Before this formal offer, it is also possible to get an estimate of the costs of the research project, for instance for use in grant applications or other funding opportunities. After a successful application for remote access to microdata, access can be provided for a maximum duration of five years.

A notable asset of the AMDC is that a research entity can request microdata not only from one microdata set but from any combination of data sets available from the AMDC. The major advantage of the AMDC is that, via deterministic linking, different microdata sets can be connected to one another. For instance, when working with person data in the AMDC, the deterministic linking is based on a specific encrypted identifier (German: verschlüsseltes bereichsspezifisches Personenkennzeichen Amtliche Statistik, vbPK-AS)[14] that is provided by the Austrian Identifier Registry Authority (German: Stammzahlenregisterbehörde).[15] The pseudonymization with the vbPK-AS is the precondition for both the protected data use and the deterministic linking of the data on persons within the secure environment of the AMDC. Hence, only data that are pseudonymized with the vbPK-AS can be processed. When working with enterprises, Statistics Austria itself is responsible for creating an encrypted enterprise identifier (German: verschlüsselte Unternehmenskennzahl). If researchers provide data on companies with a usable identifier (e.g. the enterprise number, German: Firmenbuchnummer) to Statistics Austria, this identifier is replaced with the encrypted enterprise identifier. Thereafter, the data can be linked to other microdata sets that also use this identifier within the secure environment of the AMDC.
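Deterministic linking means that records are joined only when their pseudonymized identifiers match exactly, in contrast to probabilistic linking on names or addresses. The following Python sketch illustrates the idea under stated assumptions: the function `link_deterministically`, the column name `vbpk_as` and all values are hypothetical and do not reflect the actual data structures in the AMDC.

```python
def link_deterministically(left, right, key):
    """Exact-match join of two lists of records on a shared pseudonymized key.

    Returns the linked records and the keys of `left` that found no match.
    Assumes `key` identifies at most one record in `right`.
    """
    index = {}
    for rec in right:
        if rec[key] in index:
            raise ValueError(f"duplicate key in right-hand data: {rec[key]!r}")
        index[rec[key]] = rec

    linked, unmatched = [], []
    for rec in left:
        match = index.get(rec[key])
        if match is None:
            unmatched.append(rec[key])
        else:
            # On conflicting column names, values from `left` take precedence.
            linked.append({**match, **rec})
    return linked, unmatched


# Made-up example data; "vbpk_as" stands in for the encrypted identifier.
persons = [
    {"vbpk_as": "a1", "birth_year": 1980},
    {"vbpk_as": "b2", "birth_year": 1975},
    {"vbpk_as": "c3", "birth_year": 1990},
]
incomes = [
    {"vbpk_as": "a1", "income": 42000},
    {"vbpk_as": "c3", "income": 35500},
]

linked, unmatched = link_deterministically(persons, incomes, "vbpk_as")
print(len(linked), unmatched)  # 2 ['b2']
```

Reporting the unmatched keys alongside the linked records makes the linkage rate visible, which is useful when judging whether the linked sample is still representative.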

2.3 Research Project Work

The AMDC provides data access via a secure Remote Research Environment (RRE). In no instance is microdata ever sent to researchers; rather it is only available within the RRE, which is located on servers of Statistics Austria.

Hence, after researchers sign the contract stating that all the conditions for the project (including costs, start and duration of the research and commitment to data protection) are met, the AMDC prepares an RRE for the research project and compiles the requested microdata. After the completion of all preparations, users go through the onboarding process, which is a technical introduction to using the RRE. The AMDC connects the researcher to the RRE via a Virtual Desktop Infrastructure in a “terminal server” solution, similar to those employed by the microdata centres of the Netherlands, Finland and Denmark (Reuter and Museux 2010). Logging in and working in the RRE is possible 24/7 during the whole timeframe of the research project except for scheduled or urgent system maintenance work.

The physical entry point to the AMDC RRE – according to the Federal Statistics Act (BStatG § 31 (7 4)) – requires researchers to be located in a separate and lockable room at the accredited research entity. The main objective is that there must be no risk of unauthorized viewing of data or observation of research activities. In practice, this means that the access point may not be located in an open or public space at the research entity and that home office use of the AMDC is not possible under the current legislation.

Researchers are unable to add or remove any software or data to the RRE by themselves. The AMDC provides statistical and data analytics software, which includes, as of 2023: SPSS, Stata, R (RStudio Desktop) and Python (Spyder). Upon request by the research entity, special statistical software can be installed for a fee covering the corresponding costs. Additionally, LibreOffice, Jupyter Notebook and a text editor are available. Import of external data or analysis code provided by researchers is conducted by the AMDC after a security and data protection check. Finally, when fulfilling requests to export intermediate results of research projects (e.g. for writing a research paper), the AMDC checks all tables and graphs with respect to the strict data protection guidelines (“output control”). Simply put, only outputs where no individuals and/or firms are (indirectly) identifiable are permitted to leave the RRE.[16] Once they pass output control, the outputs will be provided to the research entity via a secure data exchange service.
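The concrete output control rules of the AMDC are not spelled out here. As a rough illustration of the underlying idea, the following Python sketch applies a minimum cell count, a common heuristic in statistical disclosure control, to a frequency table before export. The threshold `MIN_CELL_COUNT`, the function names and the data are assumptions made for this example and should not be read as AMDC rules.

```python
from collections import Counter

MIN_CELL_COUNT = 5  # assumed threshold for illustration, not an AMDC rule


def frequency_table(records, by):
    """Count records per combination of the grouping variables in `by`."""
    return Counter(tuple(rec[var] for var in by) for rec in records)


def suppress_small_cells(table, min_count=MIN_CELL_COUNT):
    """Replace every count below `min_count` with None (suppressed cell)."""
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}


# Made-up firm records: six firms in sector "C", two in sector "K".
records = [{"sector": "C"}] * 6 + [{"sector": "K"}] * 2

table = frequency_table(records, ["sector"])
print(suppress_small_cells(table))  # {('C',): 6, ('K',): None}
```

The sketch covers only primary suppression. It does not handle cases in which a suppressed cell can be recovered from row or column totals, nor dominance of a cell by a single large firm, both of which a full output check would also need to consider.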

After the research phase, the data and all scripts and logs of the research project will be archived and stored for five years for the purpose of possible revisions (e.g. during journal review processes). For the purpose of replicability of results, this time period can be extended. The researchers have to cover the costs for the extended storage of all files. During the storage period researchers can apply for access to their archived files.

3 Accessible Microdata

In general, the AMDC provides access to a wide range of microdata derived from three different sources (for the structure of the AMDC see Figure 1):

  1. Microdata of Statistics Austria

  2. Microdata of the Federal State

  3. Microdata of the Research Entity

Figure 1: Structure of the AMDC.

First, the AMDC grants access to microdata of Statistics Austria. This includes not only the microdata generated by Statistics Austria through surveys but also data from many administrative registers that are used by Statistics Austria for the production of official statistics, such as the Central Population Register (German: Zentrales Melderegister, ZMR) as well as the complete corporate, income and value added tax data (for an overview of the microdata of Statistics Austria provided in the AMDC, see Table 1 in the Appendix).[17] In addition, research entities may request access to additional administrative data from the federal state based on the Research Organization Act (German: Forschungsorganisationsgesetz, FOG, § 38b). The rules in this second track of data access are different, as the precondition for access to these data is a FOG regulation issued by the responsible ministry together with the Federal Ministry of Education, Science and Research (German: Bundesministerium für Bildung, Wissenschaft und Forschung, BMBWF). In this case, in line with its one-stop-shop approach, the AMDC will forward the data request to the responsible federal ministry, acting as a liaison with the research entities and communicating decisions rendered by these ministries.

Table 1: Microdata sets of Statistics Austria in the AMDC.

Name | Data collection method | Years
Household budget survey | Survey | 2019/2020
Job vacancy survey | Survey | 2018–2022
Micro census/special programme labour force surveys | Survey | 2004–2021
Micro census: Labour force survey – ad hoc module | Survey | 2016–2022
Adult education survey | Survey | 2016/2017
University statistics | Register | 2003–2021
School statistics | Register | 2006–2022
Educational attainment | Register | 2015–2020
Register-based labour market career | Register | 2010–2022
Vital statistics – birth statistics | Register | 2015–2021
Vital statistics – death statistics | Register | 2015–2021
Marriage and registration of registered partnerships | Register | 2015–2021
Divorce and dissolution of registered partnerships | Register | 2018–2021
International migration | Survey | 2002–2021
Migration within Austria, internal migration statistics | Survey | 2002–2021
Quarterly population statistics | Survey | 2002–2022
Austrian health interview survey (ATHIS) | Survey | 2019
Livestock survey (pigs, sheep, goats, cattle) | Survey | 2020/2021
Statistics on driving licences, statistics on driving authorizations | Survey | 2006–2020
Service and structural surveys – firms | Survey | 2018–2020
Service and structural surveys – legal units | Survey | 2008–2020
Service and structural surveys – plant | Survey | 2008–2020
Service and structural surveys – work place | Survey | 2008–2020
Use of goods in the manufacturing sector | Survey | 2002–2020
INTRASTAT foreign trade statistics (EU foreign trade) | Survey | 2012–2021
EXTRASTAT foreign trade statistics (non-EU foreign trade) | Survey | 2012–2021
Economic survey manufacturing – legal units | Survey | 2014–2021
Economic survey manufacturing – plant | Survey | 2014–2021
PRODuction COMmunautaire (community production) | Survey | 2014–2021
Business cycle in the manufacturing sector | Survey | 2014–2021
Companies register (article 25) – legal unit | Register | 2013–2022
Companies register (article 25) – statistical unit | Register | 2013–2022
Companies register (article 25) – workplace | Register | 2013–2022
Companies register (article 25) – group | Register | 2013–2022
Payroll tax statistics | Secondary | 2020
Sales tax/turnover tax statistics | Secondary | 2000–2019
Income tax statistics | Secondary | 2000–2019
European union statistics on income and living conditions | Survey | 2020
Register of buildings | Register | 2002–2020
Integrated payroll and income tax statistics | Secondary | 2000–2019
Advance turnover tax return statistics | Secondary | 2008–2021
Teaching staff at public universities | Register | 2005–2022
Corporate tax statistics | Secondary | 2005–2017
Cancer statistics/Cancer register | Register | 1983–2019
ICT usage in enterprises | Survey | 2019–2022
ICT usage in households | Survey | 2019–2021
R & D in the public sector | Survey | 2009, 2011, 2013, 2015, 2017, 2019
R & D in the business enterprise sector | Survey | 2011, 2013, 2015, 2017, 2019
Co-ordinated employment statistics | Register | 2008–2020
Programme for the international assessment of adult competencies (PIAAC) | Survey | 2011/12
Employer demographics | Register | 2007–2020
General firm demographics | Register | 2007–2020
Innovation survey | Survey | 2012, 2014, 2016, 2018, 2020
High growth enterprises | Register | 2008–2021
Commodity statistics | Survey | 2002–2020
Road transport statistics | Survey | 2014–2022
Foreign affiliates statistics (FATS) | Survey | 2008–2020
Bankruptcy | Register | 2019–2022
Customer survey | Survey | 2019/2020
Mobility and cooperation in high education | Register/FOG | 2009–2022
Austrian foreign affiliates statistics (ÖFATS) | Survey | 2008–2020
Examination activity in public higher education | Register/FOG | 2018–2021
Registry of legal unit | Register | 2019–2022
Railway commodity transport statistics | Register | 2021, 2022
Holiday and business travel | Survey | 2019–2021
Advance turnover tax return statistics | Secondary | 2008–2021

For further details see https://www.statistik.at/amdc-data/ (accessed May 26, 2023).

The final data track is microdata provided by the research entities themselves, which can be linked to the other data. A precondition for the use of this data source is the removal of all direct identifiers and the pseudonymization with a specific encrypted identifier (e.g. vbPK-AS; see above) or any available firm identifier compatible with AMDC data. Individual and firm level pseudonyms are used by the AMDC to make data sets linkable.

The AMDC microdata catalogue holds time series starting from the early 2000s. Researchers’ requests for longer observation periods cannot be met in the foreseeable future as the legal foundation for safely linking data by the encrypted identifier (vbPK) was only established in 2004; thus the implementation of the unique identifiers in all areas of data gathering is, as of 2023, still an ongoing process. Nonetheless, the data available at the AMDC is rather comprehensive, in particular with respect to the variety of topics. In terms of person and household characteristics, the research opportunities in the AMDC are quite extensive: The Austrian Census, consisting of the Population Census, the Housing Census and the Census of Local Units of Employment, has been available since 2011. Given its register-based nature, the majority of variables have been produced on an annual basis since then. Income tax statistics and education-related statistics add further valuable register-based information. In addition, data on COVID-19 vaccinations and on COVID-19 infections will be available in the AMDC, creating research opportunities to analyse public health measures during the pandemic and the long-term effects of a COVID-19 infection.

With respect to business statistics, with data including the companies register, corporate tax statistics, trade statistics and foreign affiliates statistics, remarkable opportunities are provided within the AMDC to conduct excellent business-related research, for instance when it comes to the analysis of the development of productivity or factors of success. The coordinated employment statistics allow for linking firm level data to individual level data. In this regard, they enable researchers to analyse data on the firm and employer levels, while simultaneously changing perspective and including individual, family and household levels (see Table 1).
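How an employment data set can act as a bridge between persons and firms may be sketched as follows. The Python example assumes a hypothetical employment table that holds one pseudonymized person identifier and one pseudonymized firm identifier per employment relationship; the function `tertiary_share_by_firm`, all field names and all values are invented for illustration and do not describe the actual layout of the coordinated employment statistics.

```python
from collections import defaultdict


def tertiary_share_by_firm(employment, persons):
    """Share of each firm's linked employees with tertiary education."""
    education = {p["person_id"]: p["education"] for p in persons}
    counts = defaultdict(lambda: [0, 0])  # firm_id -> [tertiary, total]
    for spell in employment:
        level = education.get(spell["person_id"])
        if level is None:
            continue  # employee not found in the person-level data
        counts[spell["firm_id"]][1] += 1
        if level == "tertiary":
            counts[spell["firm_id"]][0] += 1
    return {firm: tertiary / total
            for firm, (tertiary, total) in counts.items()}


# Made-up bridge table: one row per employee-employer relationship.
employment = [
    {"person_id": "P1", "firm_id": "F1"},
    {"person_id": "P2", "firm_id": "F1"},
    {"person_id": "P3", "firm_id": "F2"},
    {"person_id": "P4", "firm_id": "F2"},  # no person-level record available
]
persons = [
    {"person_id": "P1", "education": "tertiary"},
    {"person_id": "P2", "education": "secondary"},
    {"person_id": "P3", "education": "tertiary"},
]

print(tertiary_share_by_firm(employment, persons))  # {'F1': 0.5, 'F2': 1.0}
```

The same bridge can be read in the other direction, attaching firm characteristics to each person, which is what allows a change of perspective between the firm level and the individual level.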

4 Status and Outlook as of 2023

The potential for further development of the AMDC is very promising, for instance by linking existing and new data sets or expanding the data available to new data sources and new topical areas. First, within the AMDC it is possible to link individual level data (e.g. employees) to firms (employers) (Abowd and Kramarz 1999; Goetz et al. 2017; Weinhardt et al. 2017). Here, the AMDC can rely on unique identifiers, which can consistently be used for past, current and future data. Second, in terms of the topical coverage of the AMDC, it is expected that the microdata sets will expand to include a number of health and socio-economic characteristics. The sources include data from the public administration, data from research entities and even (tailor-made) survey data to fill current gaps.[18] In this context, one highlight is the Austrian Socio-Economic Panel (ASEP), which will start its full operations in 2024. The aim of ASEP is to establish a longitudinal household panel comprising an annual household survey that is complemented with register-based data, thus allowing researchers to fully benefit from the linkage of numerous, already existing and new data sources from public administration and data held by Statistics Austria.[19]

The introduction of a domain-specific unique personal identifier (e.g. the vbPK-AS for Statistics Austria) in the Austrian public administration, legally implemented in 2004,[20] opened doors to link data within and across the Austrian public administration and, in turn, for academic research. Since its implementation, the identifier has been gaining importance in administrative bodies. With the implementation of new public digital registers (e.g. the Austrian Vital Statistics Registry in 2015), the number of linkable data sets is still growing.

As of 2023, data available in the AMDC is based primarily on data from within the statistical production process (source 1, see Section 3). It is expected that the second source, additional administrative data from the federal state, will grow over time, as separate legal acts by the responsible ministries are the precondition for the use of these data. In order to maximize the potential of the AMDC for excellent research and evidence-based scientific policy advice, access to these data must be released by the responsible ministries through FOG regulations. Since the ministries have committed themselves to data access for science, especially in the context of the current crises, the authors assume that the ministries will issue such regulations in a timely manner. Ultimately, this will also further strengthen Austria as a location for science.


Corresponding author: Tobias Thomas, Director General, Statistics Austria, Vienna, Austria; Düsseldorf Institute for Competition Economics (DICE), Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany; and Centre of Media Data and Society (CMDS) of the Central European University (CEU), Budapest, Hungary
The authors are grateful to the editors Peter Winker and Joachim Wagner as well as to Adam Lederer (Berlin) for very useful hints and comments.

References

Abowd, J.M. and Kramarz, F. (1999). The analysis of labor markets using matched employer-employee data. In: Ashenfelter, O. and Card, D. (Eds.), Handbook of labor economics, 1st ed. Vol. 3, Ch. 40, pp. 2629–2710, https://doi.org/10.1016/S1573-4463(99)30026-2.

Ahmad, N., De Backer, K., and Yoon, Y. (2009). An OECD perspective on microdata access: trends, opportunities and challenges. Stat. J. IAOS 4: 57–63.

Benesch, C., Loretz, S., Stadelmann, D., and Thomas, T. (2019). Media coverage and immigration worries: econometric evidence. J. Econ. Behav. Organ. 160: 52–67, https://doi.org/10.1016/j.jebo.2019.02.011.

Borchsenius, L. (2006). New developments in the Danish system for access to micro data. In: Monographs of official statistics. European Commission, Luxembourg, pp. 13–20.

Goetz, C., Hyatt, H., McEntarfer, E., and Sandusky, E. (2017). The promise and potential of linked employer-employee data for entrepreneurship research. In: Haltiwanger, J., Hurst, E., Miranda, J. and Schoar, A. (Eds.), Measuring entrepreneurial businesses: current knowledge and challenges. University of Chicago Press, Chicago, pp. 433–462, https://doi.org/10.7208/chicago/9780226454108.003.0012.

Lüthen, H., Schröder, C., Grabka, M., Goebel, J., Mika, T., Brüggmann, D., Ellert, S., and Penz, H. (2022). SOEP-RV: linking German socio-economic panel data to pension records. Jahrb. Natl. Stat. 242: 291–307, https://doi.org/10.1515/jbnst-2021-0020.

Reuter, W.H. and Museux, J.M. (2010). Establishing an infrastructure for remote access to microdata at Eurostat. In: Domingo-Ferrer, J. and Magkos, E. (Eds.), Privacy in statistical databases. PSD 2010, lecture notes in computer science, Vol. 6344. Springer, Berlin, Heidelberg, pp. 249–257, https://doi.org/10.1007/978-3-642-15838-4_22.

Thomas, T., Heß, M., and Wagner, G.G. (2017). Reluctant to reform? A note on risk loving of politicians and bureaucrats. Rev. Econ. 68: 167–179, https://doi.org/10.1515/roe-2017-0023.

Weinhardt, M., Meyermann, A., Liebig, S., and Schupp, J. (2017). The linked employer–employee study of the socio-economic panel (SOEP-LEE): content, design and research potential. Jahrb. Natl. Stat. 237: 457–467, https://doi.org/10.1515/jbnst-2015-1044.

Received: 2023-06-12
Accepted: 2023-06-12
Published Online: 2023-07-04
Published in Print: 2024-08-27

© 2023 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 17.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jbnst-2023-0043/html