Combined International Trade and Investment Data “CITID” – A New Innovative and Comprehensive Data Landscape for Germany

Dominik Boddin, Hendrik W. Kruse, Hariolf Merkle, Susanne Walter and Benedikt Zapf
Published/Copyright: 19 March 2024

Abstract

The Combined International Trade and Investment Data (CITID) represents a new innovative and comprehensive data landscape that combines firm-level data focused on international trade and investment. The datasets are sourced from both the Deutsche Bundesbank and the Federal Statistical Office of Germany, ensuring a reliable and robust data landscape. Within the CITID, multiple individual datasets seamlessly merge, encompassing a wide range of information. These include data on trade in goods, trade in services, inward and outward foreign direct investment (FDI), as well as information on international financial and capital transactions along with additional firm-level data. The data enables a holistic understanding of firms’ international integration, previously unattainable, and offers new opportunities for analyzing global trade dynamics. The data will be accessible for research. This paper provides a comprehensive description of the individual datasets, including information about their combinability, the methodology for matching them, the data’s content, and its potential for further research.

JEL Classification: C80; P45; F14

1 Introduction

In today’s interconnected world, the increasing global integration of firms holds significant implications for the domestic economy. As nations become more closely linked through trade and investment, understanding these implications for domestic economies becomes paramount. Foreign shocks can propagate rapidly through trade channels, such as supply chains, investments, and capital flows, swiftly crossing borders and causing potential issues that require informed policy decisions.[1]

Moreover, as our world becomes progressively more digitalized and intangible products and services assume a pivotal role, the analysis of trade activities might require a simultaneous observation of trade in goods and services. The Sturgeon Report (2013) has raised awareness of the importance of cohesive statistical data sources for effectively monitoring economies that are interrelated through global value chains.

Therefore, the presence of a comprehensive database that accurately depicts the international integration of firms becomes all the more crucial. However, existing data sources often focus on only one aspect of international connections. In the course of a recent project funded by the German Federal Ministry for Economic Affairs and Climate Action, the collaboration between the Deutsche Bundesbank and the Federal Statistical Office of Germany has bridged this gap by combining data from both institutions for research purposes for the very first time.

This allows researchers to jointly examine data pertaining to multiple dimensions of international integration. The data encompasses various facets such as trade in goods, trade in services, inward and outward foreign direct investment (FDI), and international financial and capital transactions, complemented by additional firm-level data. In the context of this article, we will label this new data landscape as the Combined International Trade and Investment Data (CITID).

Researchers will be able to access the anonymized data at the Research Data Centre’s (RDC) guest researcher workstations in a secure environment at selected RDC locations.[2] The availability of CITID opens up new avenues for researchers to understand the patterns, drivers, and consequences of international integration. They can now, for instance, delve into the intricate interplay between FDI, trade flows of goods and services, capital transfers, and other critical aspects of globalization. CITID helps overcome the reliance on fragmented datasets that cover only one aspect of international exposure, a limitation that can make it challenging to develop robust theories and models to explain the multifaceted nature of international integration.

The remainder of this paper is organized as follows. Section 2 briefly presents the content of the new data landscape, i.e. the individual datasets that can be combined. Section 3 introduces the underlying methodology, in particular the matching procedure. Section 4 describes the data access and the opportunities for research. Section 5 offers some concluding remarks.

2 Data Methodology

This section provides a concise introduction to the individual datasets that can be integrated within the new data landscape. In the past, these datasets have already proven useful for various research purposes, including the field of international trade.[3] However, until recently, it was only possible to link the datasets with other data offered by the same institution (either Deutsche Bundesbank or Federal Statistical Office of Germany). For the first time, the new CITID allows researchers to combine data from the Deutsche Bundesbank and Destatis. Currently (as of January 2024) the data that can be integrated within the new data landscape share a timeframe spanning from 2011 to 2020. It is planned to update the data annually. For more detail, please also refer to the individual data or meta data reports (see, e.g. Kruse et al. (2023)).

2.1 Deutsche Bundesbank Data

2.1.1 Statistics on International Trade in Services (SITS)

The Statistics on International Trade in Services (SITS) contains all service transactions with foreign countries whose value exceeds 12,500 euros or the equivalent in another currency.[4] Domestic firms, banks, private individuals and public authorities are legally obliged to report to the Deutsche Bundesbank in order to compile the balance of payments statistics in Germany.[5] Each observation in the SITS corresponds to a reported service transaction, but we aggregate the information at the firm level and obtain an unbalanced panel with monthly frequency. The data provide detailed information about the service transaction such as transaction value, type of exported and imported services (for instance, transport, research and development), partner country and sector classification of the domestic firm.

2.1.2 Microdatabase Direct Investment (MiDi)

The Microdatabase Direct Investment (MiDi) is a dataset that provides insights into foreign direct investment (FDI) stocks, leveraging official administrative German FDI microdata.[6] The dataset encompasses both outward and inward FDI activities involving German firms. While private individuals are also included in the data, the lion’s share can be attributed to firms and banks. German-based investors have a legal obligation to report all investments made abroad if the foreign subsidiary’s total assets exceed 3 million euros, and if the investor holds a minimum of 10 % shares or voting rights. Conversely, domestic firms are required to report foreign investments if the total assets of the German subsidiary exceed 3 million euros, and the investor holds at least 10 % of the shares or voting rights. The original FDI microdata is derived from annual cross-border shareholding reports collected by the Deutsche Bundesbank to compile FDI inventory statistics for Germany.
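The two reporting conditions described above can be written down as a small predicate. The following is a minimal Python sketch, assuming a hypothetical `Holding` record; the thresholds are the ones stated in the text, while the field names are illustrative and the actual legal rules may contain further provisions that are not modelled here.

```python
from dataclasses import dataclass

ASSET_THRESHOLD_EUR = 3_000_000  # total assets of the subsidiary, as stated in the text
MIN_SHARE = 0.10                 # minimum share of capital or voting rights


@dataclass
class Holding:
    """Hypothetical record of one cross-border shareholding (not a MiDi variable list)."""
    subsidiary_total_assets_eur: float
    share: float  # investor's share of capital or voting rights, between 0 and 1


def must_report(holding: Holding) -> bool:
    """True if both conditions named in the text are met."""
    return (
        holding.subsidiary_total_assets_eur > ASSET_THRESHOLD_EUR
        and holding.share >= MIN_SHARE
    )


examples = [Holding(5_000_000, 0.25), Holding(5_000_000, 0.05), Holding(3_000_000, 0.50)]
flags = [must_report(h) for h in examples]
```

Only the first example is reportable in this sketch: the second falls below the 10 % share, and the third does not exceed the asset threshold.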

2.1.3 Statistics on International Financial and Capital Transactions (SIFCT)

To compile the German balance of payments statistics encompassing the financial account, capital account, and investment income, German residents have a legal obligation to report capital and financial transactions, as well as capital income, exceeding 12,500 euros or its equivalent in another currency. This reporting requirement applies to both income received from abroad and expenditure made abroad.[7] Each observation in the SIFCT dataset corresponds to the total income from a specific type of capital or financial transaction for a resident towards a particular country within a given month, or to the total expenditure. The data can be aggregated at the firm (time) level, resulting in an unbalanced panel with monthly frequency.[8] The dataset provides comprehensive information about the receivables and liabilities between residents and foreigners, including the acquisition and disposal of non-produced, non-financial assets, as well as transactions involving financial assets, liabilities, and investment income. Additionally, the data set includes details about the counter-party country and the industry classification of the resident company, further enriching its content.

2.2 Data from the Federal Statistical Office and the Statistical Offices of the Länder

2.2.1 Statistical Business Register (SBR)

The Statistical Business Register (SBR, in German “Unternehmensregister System”) contains essential information about firms based in Germany. It includes details such as the industry, employment, and revenue of these firms. The SBR serves as the survey frame for various statistics and includes different unique identification numbers that facilitate linking the data. The SBR is updated annually.[9]

2.2.2 AFiD-Panel International Trade in Goods Statistics (AHS-Panel)

The AFiD-Panel International Trade in Goods Statistics (AHS-Panel) is based on microdata from International Trade in Goods Statistics. It contains monthly data on the value and quantity of imports and exports of goods of German firms, categorized according to goods categories (according to the Goods Classification for International Trade in Goods Statistics – WA), partner country, and various other characteristics. The AHS-Panel is an unbalanced panel that currently (as of January 2024) covers reference years 2011–2020. It is important to note that the AHS-Panel only includes units that can be linked to the SBR. This implies that foreign units are not included in the AHS-Panel. They may, however, be obliged to report to International Trade in Goods Statistics. As a result, the total trade value covered in the AHS-Panel differs from the publications of International Trade in Goods Statistics. The Federal Statistical Office of Germany plans to update the AHS-Panel annually.

The data included in the AHS-Panel has undergone additional compilation steps to allocate data reported by VAT-groups to the individual legal units. It includes estimations for firms below the exemption thresholds in intra-EU trade. Intra-EU exemption thresholds are chosen to ensure that 93 % of the import value of Germany and 97 % of the export value are directly reported. Due to the concentration of trading activities, this implies that data is recorded for only roughly 10 % of firms that trade within the EU. Data for the remaining 90 % of firms that trade within the EU and fall below the exemption threshold is estimated. For extra-EU trade, import and export values are directly reported and no estimation is needed. Kruse, Meyerhoff, and Erbe (2021, in German) describe the methods applied to overcome these issues. Estimated data for firms below the exemption thresholds is not available at the same level of detail as directly recorded data. Estimations are differentiated by direction of trade and, whenever possible, partner country.[10] For the purposes of the analyses in this paper, the data from the AHS-Panel have been aggregated at the annual level.
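To see why a high value coverage can go together with a small share of directly reporting firms, consider the following minimal Python sketch. It is an illustration under stated assumptions, not the procedure used by the Federal Statistical Office: the function `reporting_cutoff` and the ten firm values are hypothetical, and only the 93 % target comes from the text.

```python
def reporting_cutoff(values, target_coverage):
    """Return (cutoff_value, n_reporting): the largest firms, down to cutoff_value,
    jointly cover at least target_coverage of the total value."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    cumulative = 0
    for n_reporting, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative >= target_coverage * total:
            return value, n_reporting
    raise ValueError("no cutoff found; check the inputs")


# Hypothetical, highly concentrated import values of ten firms (arbitrary units).
import_values = [900, 50, 20, 10, 7, 5, 4, 2, 1, 1]

cutoff, n_reporting = reporting_cutoff(import_values, target_coverage=0.93)
share_of_firms = n_reporting / len(import_values)
```

In this toy example two of ten firms already cover 95 % of the total value, so the remaining eight firms would fall below the cutoff and their values would have to be estimated.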

2.2.3 AFiD-Panel Structural Business Statistics (SBS-Panel)

The AFiD-Panel Structural Business Statistics (SBS-Panel) combines both samples and complete surveys depending on the individual Structural Business Statistics conducted in the respective industry. The SBS-Panel is an unbalanced panel that contains a range of information including details on gross value added, employment figures, wages and salaries, and investments. However, it is important to note that the SBS-Panel does not encompass all industries. Specifically, it focuses solely on the non-financial commercial economy.[11] It includes information on aggregate export and import values from the AHS-Panel. Currently the SBS-Panel covers the reference years 2008–2020. The Federal Statistical Office of Germany plans to update the SBS-Panel annually.

3 Data Linkage

As mentioned earlier, the individual datasets described above can be matched. The primary unit of observation is at the firm-year level, although some data is available at a higher frequency (e.g. monthly frequency for SITS). It is worth noting that there may be multiple observations at the firm-year level when the underlying data allows further differentiation. For example, the AHS-Panel data offers product-level trade information for each firm in a specific month, categorized by destination and other variables. Including an additional destination or product would consequently result in an extra observation at the firm-time level. However, it is possible (but not mandatory) to aggregate all observations from all datasets at the firm-year level to establish a common foundation. Although the individual datasets may cover different time periods, they currently (as of January 2024) all share a timeframe spanning from 2011 to 2020.[12] It is planned to update the data annually.
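The aggregation to a common firm-year level can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the record layout, firm IDs, product codes and values are hypothetical and do not reflect the actual variable names of the AHS-Panel. Only the shared 2011 to 2020 window is taken from the text.

```python
from collections import defaultdict

COMMON_YEARS = range(2011, 2021)  # shared 2011-2020 window stated in the text

# Hypothetical detailed records: (firm_id, month, product, partner, value_eur).
detailed = [
    ("F1", "2019-03", "8703", "FR", 100),
    ("F1", "2019-03", "8703", "US", 250),  # extra destination -> extra observation
    ("F1", "2019-07", "8708", "FR", 50),   # extra product -> extra observation
    ("F1", "2010-05", "8703", "FR", 999),  # outside the common window, dropped
    ("F2", "2020-01", "2204", "CH", 70),
]


def collapse_to_firm_year(records):
    """Sum values over months, products and partners within the common window."""
    totals = defaultdict(int)
    for firm_id, month, _product, _partner, value in records:
        year = int(month[:4])
        if year in COMMON_YEARS:
            totals[(firm_id, year)] += value
    return dict(totals)


firm_year = collapse_to_firm_year(detailed)
```

Collapsing discards the product and destination detail, which is why the aggregation is optional rather than mandatory.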

3.1 Linking at the Year-Firm-Level

The Record Linkage Unit at the Deutsche Bundesbank’s Research Data and Service Center (RDSC) creates mapping tables for matching purposes using string matching and machine learning techniques. This mapping is necessary because the firm identifier (“ID”) used in the Deutsche Bundesbank’s data, known as the AWMUS-ID, differs from the SBR-ID used in Destatis data. The matching process generally yields good quality results, thanks to the utilization of state-of-the-art matching techniques. However, it is important to acknowledge that not all entities can be matched due to various reasons, such as differences in entity definitions for certain firms.[13]
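Once a mapping table exists, using it amounts to a key translation followed by a join. The sketch below is a hedged Python illustration of that last step only; it does not reproduce the string matching or machine learning used to build the mapping. All IDs, field names and values are hypothetical.

```python
# Hypothetical mapping table from Bundesbank IDs (AWMUS-ID) to SBR-IDs.
id_mapping = {"AW001": "SBR-17", "AW002": "SBR-42"}

bundesbank_rows = [
    {"awmus_id": "AW001", "year": 2019, "services_exports_eur": 1_200_000},
    {"awmus_id": "AW003", "year": 2019, "services_exports_eur": 300_000},  # not in the mapping
]
destatis_rows = [
    {"sbr_id": "SBR-17", "year": 2019, "goods_exports_eur": 5_000_000},
    {"sbr_id": "SBR-42", "year": 2019, "goods_exports_eur": 750_000},
]


def link_firm_year(bbk_rows, sbr_rows, mapping):
    """Join Bundesbank rows to Destatis rows at the firm-year level via the mapping."""
    sbr_index = {(row["sbr_id"], row["year"]): row for row in sbr_rows}
    linked, unmatched = [], []
    for row in bbk_rows:
        sbr_id = mapping.get(row["awmus_id"])
        partner = sbr_index.get((sbr_id, row["year"])) if sbr_id else None
        if partner is None:
            unmatched.append(row)  # no mapping entry or no Destatis record for that year
        else:
            linked.append({**row, **partner})
    return linked, unmatched


linked, unmatched = link_firm_year(bundesbank_rows, destatis_rows, id_mapping)
```

Keeping the unmatched rows separately makes it possible to inspect why an entity could not be linked, for instance because entity definitions differ.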

The cross table, Table 1, provides an overview of the linkage quotas between the individual datasets provided by the Deutsche Bundesbank and the Federal Statistical Office of Germany. Each cell represents the proportion of firms found in another dataset. The row labels correspond to the baseline data, which serve as a reference for the overlapping shares displayed in the columns. To illustrate, 1.1 % of the firms in the AHS data are also present in SITS, while 56.1 % of the firms in SITS can be found in the AHS data. These figures might indicate that most firms engaged in trade in goods do not participate in trade in services, whereas most firms engaged in trade in services are also involved in trade in goods.

Table 1: Linkage quotas between Deutsche Bundesbank (BBK) and Destatis data (in %).

Dataset    SBR compl.  SBR rel.  AFiD-AHS  AFiD-SBS  MiDi   SIFCT  SITS
AFiD-AHS   100.0       95.8      -         16.7      1.1    1.0    2.4
AFiD-SBS   100.0       98.1      35.5      -         1.9    1.5    3.6
MiDi       85.2        71.5      54.4      47.0      -      42.1   42.5
SIFCT      68.8        56.8      37.6      26.3      31.1   -      42.7
SITS       82.8        77.2      55.9      39.4      19.4   27.3   -

Note: The SBR considers such units as “analytically relevant” that surpass the current relevance thresholds of the SBR and, as such, are relevant for the calculation of the Gross Domestic Product (GDP).

[Correction Statement added after online publication 19th March 2024: In table 1, under the first row “Dataset” two captions “SITS” and “MiDi” were interconverted in line 3 and 5.]

Deviations from 100 % suggest that either not all matches were found during the record linkage process (e.g. because firms’ entity definitions might differ) or, more commonly, that firms are absent from a specific dataset because they are not subject to its reporting obligations. The SBR dataset encompasses all firms located in Germany with a taxable turnover exceeding 17,500 euros or with at least one employee subject to social security contributions (or at least 12 persons in minor employment).[14] Consequently, the matching quotas in the first columns can be viewed as an indicator of matching quality, as one would expect every firm present in a particular dataset to also be part of the universe of firms. However, a quota below 100 % in these columns can also have other causes.
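Linkage quotas of this kind are asymmetric because the denominator is always the baseline dataset. A minimal Python sketch, assuming two hypothetical sets of firm IDs (the numbers are invented and unrelated to Table 1):

```python
def linkage_quota(baseline_ids, other_ids):
    """Share (in %) of firms in the baseline dataset that also appear in the other dataset."""
    baseline, other = set(baseline_ids), set(other_ids)
    if not baseline:
        raise ValueError("baseline dataset is empty")
    return 100 * len(baseline & other) / len(baseline)


# Hypothetical firm IDs: a large dataset of goods traders, a small one of services traders.
goods_traders = {f"F{i}" for i in range(1, 101)}  # F1 ... F100
services_traders = {"F1", "F2", "F3", "X1"}

goods_in_services = linkage_quota(goods_traders, services_traders)
services_in_goods = linkage_quota(services_traders, goods_traders)
```

The same three overlapping firms yield a quota of 3 % from the perspective of the large dataset and 75 % from the perspective of the small one, which is the pattern behind row-wise reading of a cross table.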

Considering that each dataset consists of official administrative data with reporting requirements, the resulting data landscape offers a comprehensive representation of international integration in various aspects. Furthermore, it is possible, under specific conditions, to link the data with external firm data, such as data from commercial sources. Past examples of external firm data include data from “Bureau van Dijk” (e.g. ORBIS, AMADEUS), “Hoppenstedt” or “Orbis” patent data.

Figure 1 provides additional information on the number of observations available for analysis when considering different combinations of firm datasets. This visualization, based on data from the year 2020, provides a comprehensive overview of the total number of firms within these datasets. The areas where datasets overlap visually represent firms that are present in multiple datasets simultaneously.

Figure 1: Overlap of matchable firms in CITID. This figure shows the distribution of firms across the datasets MiDi, SITS, SIFCT, SBS and AHS that can be potentially matched with each other.

The central region of the figure, where all datasets overlap, contains a total of 1,976 firms. To determine the total number of observations within a single dataset, one can sum the individual areas corresponding to that dataset’s unique color. For example, the SIFCT data comprises a total of 13,075 observations, derived by adding up the individual areas as follows: 2,309 observations exclusive to SIFCT, 1,263 observations found in both SIFCT and SITS but not in any other dataset, and so on.
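The relation between the regions of the diagram and the dataset totals can be made explicit with a short Python sketch. It is an illustration with hypothetical firm IDs and only three datasets; the counts are invented and do not correspond to Figure 1.

```python
from collections import Counter


def overlap_regions(datasets):
    """Count firms per exact combination of datasets (one count per diagram region)."""
    membership = {}
    for name, firm_ids in datasets.items():
        for firm in firm_ids:
            membership.setdefault(firm, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


def dataset_total(regions, name):
    """Sum all regions that include the given dataset."""
    return sum(count for combo, count in regions.items() if name in combo)


# Hypothetical firm IDs per dataset.
datasets = {
    "SIFCT": {1, 2, 3, 4},
    "SITS": {3, 4, 5},
    "MiDi": {4, 6},
}

regions = overlap_regions(datasets)
sifct_total = dataset_total(regions, "SIFCT")
```

Each firm is counted in exactly one region, so summing the regions that contain a dataset recovers that dataset's number of firms.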

3.2 Linking by Direction and Partner Country

With the exception of the SBS-Panel, all datasets discussed above include additional dimensions that could in principle also be used to link the data. MiDi, SITS, SIFCT and the AHS-Panel all include information at the firm-level differentiated further by direction of trade and partner country. This implies that the datasets can in principle also be linked at the firm-direction-country-level. However, there are important differences regarding the interpretation of these variables in the different datasets that have to be considered to obtain valid conclusions. First, while SITS, SIFCT and AHS-Panel record flow variables, MiDi includes stocks. Second, the difference between fiscal and physical flows complicates the joint analysis of SITS, SIFCT and the AHS-Panel.

While SITS and SIFCT record payments, i.e. fiscal flows, the AHS-Panel records movements of goods, i.e. physical flows. For most movements of goods there is a corresponding payment, i.e. a fiscal flow in the opposite direction. For instance, if a German firm sells a good to a French firm and the good is moved physically from Germany to France, the French firm has to pay the German supplier the value of the good in exchange. However, the value of the good recorded in the AHS-Panel does not always correspond to an actual payment. First, the value of the goods recorded in the AHS-Panel includes freight and insurance costs up to the German border.[15] Whether freight and insurance are actually part of the payment, however, depends on the specificities of the contract between buyer and seller. Second, there is not always a corresponding payment for every movement of goods. Not every movement of goods is due to a sale or purchase. Some goods are delivered free of charge and some goods cross the border without any change of ownership (for instance for storage). In both cases, no actual payment is due, but the movement of goods will still be recorded in the AHS-Panel and valued at the goods’ market value.

Moreover, business relations can involve more than two firms, so that the movement of goods may occur between different partner countries than the corresponding payment. For example, in a setting of triangular trade, an Austrian firm may sell a good to a Swiss firm without having this good in storage. Instead, the Austrian firm may itself purchase the good from a German firm, which delivers it directly to the Swiss firm. In this case, there is a physical movement of goods between Germany and Switzerland, but no payments take place between Germany and Switzerland. Instead, there is a fiscal flow from Austria to Germany and from Switzerland to Austria.
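The triangular trade example can be encoded to show why a naive link on partner country would fail. The following Python sketch is purely illustrative: the `Flow` type and the helper `partner_countries` are hypothetical, and the sketch makes no claim about which dataset records which of these flows.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    kind: str         # "goods" (physical flow) or "payment" (fiscal flow)
    origin: str       # country code where the flow starts
    destination: str  # country code where the flow ends


# The example from the text: German supplier, Austrian intermediary, Swiss buyer.
triangular_trade = [
    Flow("goods", "DE", "CH"),    # German firm delivers directly to the Swiss firm
    Flow("payment", "CH", "AT"),  # Swiss firm pays the Austrian firm
    Flow("payment", "AT", "DE"),  # Austrian firm pays the German firm
]


def partner_countries(flows, kind, home="DE"):
    """Partner countries of the home country for all flows of the given kind."""
    partners = set()
    for flow in flows:
        if flow.kind != kind:
            continue
        if flow.origin == home:
            partners.add(flow.destination)
        elif flow.destination == home:
            partners.add(flow.origin)
    return partners


goods_partners = partner_countries(triangular_trade, "goods")
payment_partners = partner_countries(triangular_trade, "payment")
```

From the German perspective the goods partner is Switzerland while the payment partner is Austria, so the two sets do not intersect and a firm-direction-country link would find no counterpart.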

4 Data Access and Use for Research

Access to CITID data will be granted in secure environments through designated Destatis RDC guest researcher workstations or at the Deutsche Bundesbank’s RDSC workstation.[16] This access method is necessary to comply with legal requirements regarding the confidentiality of statistical reports while facilitating independent academic research with individual-level data. To obtain access, researchers are required to submit a research data request along with a research proposal. The feasibility of the research project, considering the research data’s suitability for addressing the proposed research questions, is carefully evaluated. Research projects must serve the public interest, and commercial endeavors are not accepted.

The new data landscape brings novelty by offering a comprehensive view of various aspects of international integration simultaneously. Previously, researchers and analysts often had to rely on fragmented datasets that only covered specific facets of international exposure, which might have posed challenges in developing robust theories and models to explain the multifaceted nature of international integration. For instance, when examining solely trade in goods data, one might assume that a shock from a foreign country would not affect domestic firms if there were no import or export relationships with that country. However, firms could still be integrated through means such as FDI, capital transfers, or trade in services. Having a complete understanding of international integration allows for more in-depth analysis of its effects. Kruse et al. (2023) illustrate the potential of the new data landscape in more detail. By connecting the AHS-Panel with MiDi data, the authors provide one such example by exploring the role of foreign direct investment firms and foreign direct investors in the trade of goods.

5 Concluding Remarks

This paper introduces the CITID, a novel and expansive data landscape that integrates firm-level data specifically related to international trade and investment. The CITID provides a comprehensive portrayal of firms’ international integration across various aspects. Researchers have access to this new landscape, offering exciting opportunities to explore and gain deeper insights into the patterns, drivers, and consequences of international integration.


Corresponding author: Dominik Boddin, Deutsche Bundesbank, Wilhelm-Epstein-Strasse 14, 60431 Frankfurt, Germany
Any opinions expressed in this paper represent the authors’ personal opinions and do not necessarily reflect the views of the Deutsche Bundesbank, the Federal Statistical Office or their staff. The results presented in this paper were developed within a joint project including the Kiel Institute for the World Economy (IfW), the Institut für Angewandte Wirtschaftsforschung (IAW) at the University of Tübingen, the Federal Statistical Office of Germany and the Deutsche Bundesbank. The Federal Statistical Office and the Statistical Offices of the Länder acknowledge generous funding by the German Federal Ministry for Economic Affairs and Climate Action. We thank the German Federal Ministry for Economic Affairs and Climate Action and our project partners from the IfW and IAW.

References

Baldwin, R., and R. Freeman. 2022. “Risks and Global Supply Chains: What We Know and What We Need to Know.” Annual Review of Economics 14 (1): 153–80. https://doi.org/10.1146/annurev-economics-051420-113737.

Biermann, M., and K. Huber. 2023. “Tracing the International Transmission of a Crisis through Multinational Firms.” Journal of Finance. https://doi.org/10.2139/ssrn.4405156.

Biewen, E., and H. Stahl. 2021. Statistics on International Financial and Capital Transactions (SIFCT). Data Report 2021–06 – Metadata Version 2. Deutsche Bundesbank, Research Data and Service Centre.

Biewen, E., and A. Meinusch. 2021. Statistics on International Trade in Services (SITS) 01/2001 – 04/2021. Data Report 2021–14, Metadata Version 5. Deutsche Bundesbank, Research Data and Service Centre.

Blank, S., A. Lipponer, C.-J. Schild, and D. Scholz. 2020. “Micro-database Direct Investment (MiDi) – A Full Survey of German Inward and Outward Investment.” German Economic Review 21 (3): 273–311. https://doi.org/10.1515/ger-2019-0123.

Boehm, C. E., A. Flaaen, and N. Pandalai-Nayar. 2019. “Input Linkages and the Transmission of Shocks: Firm-Level Evidence from the 2011 Tohoku Earthquake.” The Review of Economics and Statistics 101 (1): 60–75. https://doi.org/10.1162/rest_a_00750.

Doll, H., E. Gábor-Tóth, and C.-J. Schild. 2021. Linking Deutsche Bundesbank Company Data. Technical Report 2021–05, Version v2021-2-6. Deutsche Bundesbank, Research Data and Service Centre.

Eppinger, P. 2019. “Service Offshoring and Firm Employment.” Journal of International Economics 117: 209–28. https://doi.org/10.1016/j.jinteco.2019.01.007.

Fauth, M., B. Jung, and W. Kohler. 2023. “German Firms in International Trade: Evidence from Recent Microdata.” Jahrbücher für Nationalökonomie und Statistik 234 (3–4): 199–284. https://doi.org/10.1515/jbnst-2022-0040.

FDZ (Forschungsdatenzentren der Statistischen Ämter des Bundes und der Länder). 2023a. Metadatenreport. Teil I: Allgemeine und methodische Informationen zum AFiD-Panel Außenhandelsstatistik (AHS-Panel), Berichtsjahre 2011–2019. Version 1. Wiesbaden. Accessed 24 August 2023. Available at: www.forschungsdatenzentrum.de.

FDZ (Forschungsdatenzentren der Statistischen Ämter des Bundes und der Länder). 2023b. Metadatenreport. Teil II: Produktspezifische Informationen zur Nutzung des AFiD-Panels Außenhandelsstatistik (AHS-Panel) 2011–2019 am Gastwissenschaftsarbeitsplatz sowie per kontrollierter Datenfernverarbeitung. Version 2. Wiesbaden. Accessed 24 August 2023. Available at: www.forschungsdatenzentrum.de.

Friedrich, K., L. Pham-Dao, C.-J. Schild, D. Scholz, and J. Schumacher. 2021. “Microdatabase Direct Investment - Data Report 2021-23.” Deutsche Bundesbank, Research Data and Service Centre.

Gábor-Tóth, E., and C.-J. Schild. 2021. Understanding Overlaps between Different Company Data. Technical Report 2021-06, Version v2021-2-6. Deutsche Bundesbank, Research Data and Service Centre.

Gábor-Tóth, E., C.-J. Schild, and S. Walter. 2023. Linking Deutsche Bundesbank Data. Technical Report 2023-05. Deutsche Bundesbank, Research Data and Service Centre.

Görg, H., A. Jacobs, and S. Meuchelböck. 2023. Who is to Suffer? Quantifying the Impact of Sanctions on German Firms. IZA Discussion Paper No. 16146. https://doi.org/10.2139/ssrn.4456331.

Gumpert, A., J. Hines, and M. Schnitzer. 2016. “Multinational firms and tax havens.” Review of Economics and Statistics 98 (4): 713–27. https://doi.org/10.1162/REST_a_00591.

Kruse, H. W., F. Hieber, H. Limberg, B. Zapf, and D. Boddin. 2023. Außenhandelsaktive Unternehmen: Neue Analysemöglichkeiten durch Mikrodatenverknüpfung. WISTA Wirtschaft und Statistik Ausgabe 5/2023.

Kruse, H. W., A. Meyerhoff, and A. Erbe. 2021. Neue Methoden zur Mikrodatenverknüpfung von Außenhandels- und Unternehmensstatistiken. WISTA Wirtschaft und Statistik Ausgabe 5/2021.

Statistisches Bundesamt. 2022. Statistisches Unternehmensregister. Qualitätsbericht 2021. Wiesbaden.

Sturgeon, T. J. 2013. Global Value Chains and Economic Globalisation: towards a New Measurement Framework. MIT Industrial Performance Center report to Eurostat. http://ec.europa.eu/eurostat/documents/54610/4463793/Sturgeon-report-Eurostat.

Tintelnot, F. 2017. “Global Production with Export Platforms.” The Quarterly Journal of Economics 132 (1): 157–209. https://doi.org/10.1093/qje/qjw037.

Received: 2024-01-31
Accepted: 2024-02-01
Published Online: 2024-03-19
Published in Print: 2025-06-26

© 2024 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 10 January 2026 from https://www.degruyterbrill.com/document/doi/10.1515/jbnst-2024-0024/html