Matching Compustat & Datastream data in Excel

Although it is recommended to use the same source for as much of your data as possible, there is not always an alternative. Compustat North America, for instance, allows you to downoad anual report data through the Fundamentals Annual database together with stock prices at the end of the calendar or fiscal year. In this case there is (often) no need to go to another source.
In the case of Compustat Global the Fundamentals Annual part database does not have these two Price variables. In this case you can choose to match data from Compustat with data from Datastream. In the example I use here I used a list of ISIN codes of current AEX index companies from datastream.

N.B.: Important: annual report data is also available through Worldscope in Datastream. In this example I assumed that the starting point for a selection would be Compustat Global.

For the example I created an example spreadsheet where you can see how I combined the data step by step. You can download the example Excel file here.

1) First I used the list of equities to download annual report data from Compustat for the period 2000 to 2012. Example Screenshot (WRDS tab in the Excel file):

2) Next I used Datastream to download the End of the calendar year price. First I uploaded the list using the option in the Excel Datastream add-in: Create List(from range):

Next I did a time series search using the uploaded list (use the List picker button below the orange button Find Series). I chose to download the data Transposed which makes it easy to use the download and making the combination. Transpose = yearly data in the columns. This way I can also choose to download the codes that I uploaded and get them in my output. See screenshot:

The output looks as you see it in the Excel spreadsheet tab Datastream:

3) To match the data from Compustat with the Datastream data we need to combine the ISIN codes with the year to make it unique. First we need to use the LEFT() function in Excel (in Dutch = LINKS() )to create a new column with only the ISIN code. See the example screenshot (= This is tab Stap 1 in the Excel spreadsheet):

4) The problem with the Datastream output is also that the ISIN codes are returned but not for each column of data. We therefore need to copy the data and make sure that the codes and annual columns are copied below each other. For each year we also need to add the year to be able to later generate a unique ISIN+year combination. See the example (= This is tab Stap 2 in the Excel spreadsheet):

5) To create unique ISIN+Year ID codes we can use the CONCATENATE() Excel function (in Dutch = Tekst.Samenvoegen). See the previous example Datastream screenshot (= In Excel see column C in the tab Stap 2 for Datastream and Column A in tab Stap 3 for WRDS) and the WRDS example here:

6) As the final step we can now use the data from Datastream and add it to the WRDS spreadsheet using the VLOOKUP() function (in Dutch = Vertikaal.Zoeken() ). The full formula looks like this: =VLOOKUP(A2,’Stap 2′!$C:$D,2,FALSE). If I break it down it looks as follows:

– Function start: =VLOOKUP(
– Look up content of Cell: A2
– In Sheet Stap 2 look for value in cell range columns C and D: ‘Stap 2’!$C:$D
– If there is a match show the value in the second selected column (= column D): 2
– Only return a value if the match is exact: FALSE.
– Function close: )

A) The elements in the VLOOKUP function are divided by comma’s. In the Dutch language version of Excel those would be semi-colons ( ; ).
B) In the example above I used Dollar signs to fixate the cell table range: $C:$D. It is not always necessary to do this in a VLOOKUP function. They also do not automatically appear when you have selected the table-range. You can automatically add them in the formula using function key F4 or by typing them in.

In the tab Stap 4 of the Excel spreadsheet you see how this works:


%d bloggers like this: