WorldScope coverage update 3rd Quarter 2015

WorldScope company records now cover annual report data for 78553 companies: 46714 active and 31839 inactive companies worldwide. In this update 565 companies were added. As a result, the number of active companies in the database has gone up by 229, from 46485 in June. This is the fifth update in a row in which the total number of active companies has gone up. WorldScope company records are also available through Datastream and LexisNexis.

Today I have updated the WorldScope country coverage file and it now includes the latest update as it was posted in the third Thomson Reuters Infostream quarterly publication of 2015.

Countries with major updates (new records):
Australia (20)
China (135)
Hong Kong (19)
India (37)
Japan (44)
Taiwan (18)
United Kingdom (30)
United States (116)
Vietnam (29)


Compustat & missing data

When you are using Compustat to download data for a number of companies you will probably get missing data for some companies even though the variables you selected for the output are relatively common items. There are several reasons why this happens and I will discuss some of these here (for active companies).

One reason may be the accounting standards that companies use to report their financial data, together with the way accounting rules are interpreted in different countries. Some variables are simply not required and may therefore appear in the reported financial statements for some years and not for others. Also, if a company switches accounting standards, the items it reports will change. Companies that use IFRS usually report more standard variables, and these are comparable across companies.

In addition to changes in companies and the way they report, missing data can also be caused by the way the Compustat databases change over time. Some variables, for instance, are legacy variables that no longer contain data or only have data for the older years in the database. On the WRDS platform there is a file called “List of Entirely Null Variables in Compustat Datasets” in the support section under “Vendor Manuals”.
An example of a variable that no longer has data in the Fundamentals Annual database of Compustat North America is Audit Fees: RMUM (Auditors’ Remuneration).
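If you have already downloaded an extract, you can also check for yourself which of the selected variables contain no data at all. The snippet below is only a minimal sketch: the sample rows are made up, and the column names (gvkey, fyear, at, capx, rmum) and the list of "missing" markers are assumptions about how your own export looks.

```python
import csv
import io


def entirely_null_columns(rows, missing=("", ".", "NA")):
    """Return the column names that have no usable value in any row."""
    rows = list(rows)
    if not rows:
        return []
    columns = list(rows[0].keys())
    return [
        col
        for col in columns
        if all((row.get(col) or "").strip() in missing for row in rows)
    ]


# Hypothetical extract: the values are invented for illustration only.
SAMPLE = """gvkey,fyear,at,capx,rmum
001000,2013,150.2,12.1,
001000,2014,161.0,,
002000,2014,980.5,40.3,
"""

rows = csv.DictReader(io.StringIO(SAMPLE))
null_columns = entirely_null_columns(rows)
print(null_columns)
```

A variable that is empty for every company and every year in your extract is a good candidate to look up in the list of entirely null variables before you spend time on other explanations.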

The final reason why data may be missing is the type of company: Financial or Industrial. Financial type companies are usually active in sectors such as insurance and banking. If you use the screening filters at Search Step 2 in WRDS to limit your selection to Financial or Industrial type companies, you may not get (enough) data. This is particularly tricky when a company files two types of financial statements and therefore has two records (filings) for each fiscal year in Compustat. One of the records will be marked as Industrial (INDL) and the other as Financial Services (FS) in the industry format field, to indicate what type of filing it is.

An example of this type of variance is the variable “Capital Expenditure” (CAPX). Compustat defines it as follows: "This item represents the funds used for additions to property, plant, and equipment, excluding amounts arising from acquisitions (for example, fixed assets of purchased companies). This item includes property & equipment expenditures." Industrial companies with Industrial type filings will report the variable. Financial type companies often will not. The exception is, of course, a company that files both types of statements.
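To see whether the filing type explains the gaps in a download, you can look for company-years that appear twice and count the missing values per filing type. This is a rough sketch and not a WRDS feature: the records are invented, and it assumes the filing type is stored in a column called indfmt with the values INDL and FS, next to gvkey, fyear and capx.

```python
from collections import defaultdict

# Hypothetical records: all identifiers and values are made up.
records = [
    {"gvkey": "001000", "fyear": 2014, "indfmt": "INDL", "capx": 12.1},
    {"gvkey": "002000", "fyear": 2014, "indfmt": "INDL", "capx": 40.3},
    {"gvkey": "002000", "fyear": 2014, "indfmt": "FS", "capx": None},
    {"gvkey": "003000", "fyear": 2014, "indfmt": "FS", "capx": None},
]


def dual_filers(records):
    """Return the (gvkey, fyear) pairs that have more than one filing type."""
    formats = defaultdict(set)
    for rec in records:
        formats[(rec["gvkey"], rec["fyear"])].add(rec["indfmt"])
    return sorted(key for key, fmts in formats.items() if len(fmts) > 1)


def missing_share_by_format(records, variable):
    """Return the share of records per filing type where the variable is missing."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        total[rec["indfmt"]] += 1
        if rec[variable] is None:
            missing[rec["indfmt"]] += 1
    return {fmt: missing[fmt] / total[fmt] for fmt in total}


duplicates = dual_filers(records)
shares = missing_share_by_format(records, "capx")
print(duplicates)
print(shares)
```

If nearly all of the missing values sit in one filing type, the gaps most likely come from the type of statement and not from the company failing to report the item.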

N.B.: If you are unsure about missing values or if you wish to find out if the data is really not reported (or available) you could try searching for the same list of companies in a second database and download the same variable there.
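Such a cross-check is easy to do by hand for a few companies, but for a longer list a small script may help. The sketch below assumes that you have been able to match the companies in both databases on a common identifier (here a made-up ISIN) and that missing values were read in as None; the function name and the status labels are my own and do not come from any vendor.

```python
def compare_sources(primary, secondary):
    """Classify each (identifier, year) by where a value is available."""
    report = {}
    for key in sorted(set(primary) | set(secondary)):
        first = primary.get(key)
        second = secondary.get(key)
        if first is None and second is None:
            status = "missing in both"
        elif first is None:
            status = "missing in primary only"
        elif second is None:
            status = "missing in secondary only"
        else:
            status = "available in both"
        report[key] = status
    return report


# Hypothetical values for one variable from two databases.
primary = {
    ("XX0000000001", 2014): 12.1,
    ("XX0000000002", 2014): None,
    ("XX0000000003", 2014): None,
}
secondary = {
    ("XX0000000001", 2014): 12.0,
    ("XX0000000002", 2014): 40.3,
    ("XX0000000003", 2014): None,
}

report = compare_sources(primary, secondary)
for key, status in report.items():
    print(key, status)
```

A value that is missing in both databases is more likely to be really not reported, while a value that is missing in only one of them points to a difference in coverage or definitions between the vendors.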


Small error in S&P 500 Total Return data

Recently a long time series was needed for some exercises, and the choice was made to use the default closing Price (PI) and the Total Return (RI) data for the S&P 500 index. The source for the data was Datastream from Thomson Reuters. The price data from Datastream for the S&P 500 index goes back to December 31st of 1963. The Total Return data from Datastream for this index goes back to January 1st of 1988.

While working with the data a strange outlier in the data was discovered for January 24th 1990. Statistically speaking, the calculated return made a huge jump there. See the example here:

The data was then compared to data from Yahoo Finance (using the Quandl add-in to do a quick download to Excel). A Total Return calculated from the Yahoo Finance data gave a result comparable to the RI calculated manually from the Datastream price data for the index.
After contacting the Thomson Reuters helpdesk it became clear that there was a small error in their spreadsheet that is used to calculate the RI: for January 24th an incorrect price was used for the RI calculation: 324,17 instead of 330,26.
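A check like this can be automated by comparing the daily return of the RI with the daily return of the PI: on a normal day the two should differ only by a very small dividend contribution. The following is a minimal sketch and not the method used by Thomson Reuters. Only the two prices for January 24th come from the text above; the other prices, the starting RI level of 350, the tolerance of 0.5% and the choice to ignore dividends are assumptions made for the illustration.

```python
def simple_returns(levels):
    """Return the simple period-to-period returns of a series of index levels."""
    return [curr / prev - 1.0 for prev, curr in zip(levels, levels[1:])]


def flag_divergence(dates, price_index, return_index, tolerance=0.005):
    """Flag dates where the RI return and the PI return differ by more than tolerance."""
    pi_returns = simple_returns(price_index)
    ri_returns = simple_returns(return_index)
    flagged = []
    for date, pi_ret, ri_ret in zip(dates[1:], pi_returns, ri_returns):
        if abs(ri_ret - pi_ret) > tolerance:
            flagged.append((date, round(ri_ret - pi_ret, 4)))
    return flagged


dates = ["1990-01-22", "1990-01-23", "1990-01-24", "1990-01-25", "1990-01-26"]

# Hypothetical closing prices, except 330.26 for January 24th (from the text).
pi = [331.0, 331.5, 330.26, 326.0, 325.5]

# Rebuild an illustrative RI that uses the incorrect price 324.17 on January 24th.
# Dividends are ignored and the starting level of 350.0 is arbitrary.
prices_used = list(pi)
prices_used[2] = 324.17
ri = [350.0 * price / pi[0] for price in prices_used]

flagged = flag_divergence(dates, pi, ri)
for date, gap in flagged:
    print(date, gap)
```

Note that a single wrong level produces two flagged returns in this sketch: one on the day of the error and one on the next day, when the series jumps back to the correct level.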

Unfortunately, Thomson Reuters will not correct this small error in their calculated RI. If you need to go back and use the data from that time, you could use a corrected RI for this day, which could be: 354,53. This number is more in line with the statistical variance of the return over a period of 25 years or more. See also:

This goes to show that it always pays off to check for outliers and (when in doubt) to contact the vendor/source and discuss possibilities.