Thomson Eikon & Event windows

Not too long ago I was working with the successor of Thomson 1 / Thomson One which is called Eikon or Thomson Eikon. This product is web-based and is similar in look and style as Bloomberg. I found out that in most cases for me the Excel add-in was the most useful part and allowed me to download larger amounts of data.
The specific Add-in did, however, miss a key feature which I use heavily in Datastream: there is no Request Table tool to allow event studies with changing dates and time windows. Using the add-in I was able to figure out how to still do something like this using the commands which Thomson inputs in your Excel worksheet to call up the data.

With a bit of help from the Eikon Helpdesk I was able to come up with a template that kind of does the same thing as a Request Table Excel tool from Datastream. You can download the example here.

Explanation of the worksheet:

0) Before using the worksheet template, make sure that the connection to Eikon on your computer is not yet live. Otherwise the sheet will immediately start updating as the Add-in and Eikon code is usually set up this way.

1) The first sheet is called Events
As the name already suggests, this is where you put the codes for the series (column A). In columns B and C the Start and End dates for the series go. If you need an exact number of trading days (of data) you can use the formulas from a previous post.
If the exact number of days need not be the exact same column D shows the number of calendar days between the start and end days. This is done by deducting the end date from the start date. This works fine as dates are also numbers in Excel. Cell E2 shows the highest number in the entire column D using formula =MAX(D:D)

2) The second sheet is Data
The example sheet is used to download the Eikon data into and is based on generated Eikon code(s).

Cell D1 in sheet Data is linked to Cell E2 from the Events sheet and will show the same value with the formula: =Events!D2
This number is important as this indicates the (maximum) number of rows for each new window to start downloading data without overwriting the data from previous (downloaded) series. This may cause empty rows between series/data, but these empty rows can be deleted afterwards


The formulas in Column C are used to calculate the row/cell numbers for each separate series to start downloading: =((B4-$B$3)*$D$1)

This is also based on the generated number of events in columb B: =B3+1.


The Helper Column is column A. This column is needed to get the download destinations to be included later in the Cells with the Thomson Eikon codes. There is just a technical reason for this, otherwise the Eikon code does not accept the generated destination cells. The formula is: =ADDRESS($C4;6) where the $C4 is the number of rows it takes from column C and 6 indicates the column number where the download should start.


The download code from eikon is as follows: =TR(Events!$A2;”TR.PriceClose”;”Frq=D SDate=#1 EDate=#2 Curn=EUR CH=Fd;date;IN Transpose=Y NULL=NA CODE=ISIN”;INDIRECT($A3);Events!$B2;Events!$C2)

  • This formula takes the series data from the Events tab in column A where it says Events!$A2
  • The start & end dates are included at the end where it says: Events!$B2;Events!$C2
  • The data download cell destinations for the events are indicated where it says: INDIRECT($A3)

You just copy everything downwards (including the Eikon formula) depending on the number of events in the dataset.

3) When you are done copying the events in to the sheet Events and copying the necessary formulas in the Data sheet, you can then go live by signing in into Eikon. You will see that the data download starts automatically.

N.B.: The example download sheet downloads the closing price for series with the Eikon datatype code TR.PriceClose. If you need another datatype this part needs to be changed in the formula.

Email

Stata & large downloads from Datastream

In the past I demonstrated how it was possible to change a dataset from a wide presentation (= data in columns) to a long presentation (= data in rows) using the reshape command in Stata. This option works fine where smaller datasets are concerned and if there is no choice when downloading data. For larger datasets, however, it may become a problem to download the data in wide format and also the Stata software may have difficulty changing the presentation.

If you download, for instance, daily price data from Datastream for many years (using ISIN codes) for x number of listed companies the wide download format (transposed) is not a good option. Try downloading everything without transposing and you get everything in long format as follows:

Changing the data to a proper long format for Stata goes as follows:

1) Make a copy of the original data and add a new row at the top.
2) Above the Dates in column A type in a general name like “company” in Cell A1.
3) Also use numbers for each individual company at the top.
4) Insert a new row above the data and give the name Date in Column A above the actual dates
5) In the cells of this same row use an Excel formula to generate fake names. Example: =$A$1&B1
The copied Sheet should look as follows:

6) Now also make a copy of the first 3 rows and paste the data transposed using the Right-click option in a new sheet. Example:

7) Use the Excel function =left() to get the ISIN codes (12 digits) in a separate column or use Search and replace to get rid of the variable indicator (P)


8) Copy and paste this data as values to a new sheet. It should look as follows:9) Make a new copy of the first changed sheet and delete the first few rows. Make sure to copy & paste as value (right-click) to remove any formulas. The sheet should now look as follows:

Now start up Stata en use the sheets in the Excel Worksheet step by step:

10) First you import the sheet (from step 9) with the price data from Excel using the command import (or the Menu option File > Import > Excel spreadsheet).

11) Use the reshape command the rearrange the data to long format: reshape long company, i(Date) j(price). Change the variable names and make sure that the column with numbers is called company. Next save this as a Stata database.

12) Use clear all to start with the second sheet. Now import the second sheet with the company ISIN codes from step 8.

13) Use the merge command to merge this data with the Stata database we created at step 11). The command would look as follows:
merge 1:m company ” c:\ … \filname.dta”, sheet(…), firstrow


14) now do some data curation steps to change prices to numbers (destring command) and generate a newly formatted date. Make sure to save this file.


As an example I provide here a zip-file with an example download in Excel with the Steps on separate sheets and a .do file that can be run step by step. If you want to use this, be sure to extract both files to the folder c:\temp on your computer. You can also extract them in a different folder. In that case you need to change the locations of the files in the .do file before running the file in Stata.

Important to remember with this example: Stata may still run into problems if you have hundreds of companies (or more) for which you have daily data for many years. If there are problems I recommend converting the download in brackets of 100 to 200 companies and then appending the resulting databases to create a single Stata database.

N.B.: A student pointed an example of this out to me which he found at: Princeton University Library Data and Statistical Services. I do not claim any credit for the original idea but the Stata example provided on this blog was made by me.

Email

WorldScope coverage updates 3rd and 4th Quarter 2016

WorldScope company records now cover annual reports data for 81410 companies. This includes 47560 active and 33850 inactive companies. The last 2 updates: 913 companies were added. Because of the changes the number inactive companies in the database has stayed approximately the same: at the moment 47560. WorldScope company records are also available through Datastream and LexisNexis.

Today I have updated the WorldScope country coverage file and it now includes the latest updates as they were posted in the third and fourth Thomson Reuters Infostream quarterly publication of 2016.

Major updated Countries (new records):
Australia (34)
China (102)
Hong Kong (44)
India (148)
Japan (41)
South Korea (21)
Sweden (28)
Taiwan (23)
United Kingdom (34)
United States (239)

Email

Using Datastream or IBES for forecast accuracy

Not long ago a student contacted me with regard to the subject of forecasts. Forecast accuracy is defined as the absolute difference between the consensus analysts’ forecasts and actual earnings per share divided by the firm share price at the beginning of the quarter. Differences between actual and forecast earnings can be considered “surprise data”.
His question concerned the variables that he got through Datastream & IBES and which variables should be used. The answer I came up with (with the help of the Thomson helpdesk) was the following:

Datastream has direct data types for the Surprise earnings but this is only available for companies that have quarterly data coverage. Important in this case is, that only the current values of these data types are held – no history is maintained. It is also very difficult to calculate the Surprise data manually because of Datstream padding function.
Regardless of whether you decide to use IBES or Datastream data types, you should not use data types from both sources in the same analysis! Only use IBES data types or only use Datastream data types.

To answer the questions on using actual EPS and EPS forecasts and prices the analysis could be done using the following IBES data types:
EPSI1YR = Earnings Per Share End Date of Quarterly Period INT1
EPSI1MN = Earnings Per Share Mean INT1
I0IND = EPS Last Rep Int Period Indicator
I0EPS = EPS Reported Interim EPS (INT1)
IBP = PRICE (IBES)

Using the output from IBES (through Datastream) you have to be careful when comparing EPS data as all data should be for the exact same quarter. See below for an example with colours to indicate the quarter data to be compared:

Email

Bankscope cancelled and replaced

Bureau van Dijk (BvD) publishes the Bankscope database. It was announced recently that Fitch, the main provider of the Bankscope data, has terminated it’s contract with BvD. As a result, the availability of bank data will undergo some important changes, with possibly disruptive consequences.  BvD is currently working on a new Bankscope data product: Orbis Bank Focus. However, the coverage of the new database is limited at the moment: historical coverage in this new database is severely restricted, with up to only three years of data (Bankscope has ten (or more) years of history). BvD are prioritizing cross-sectional coverage, and are working on including recent & historical data for banks in different countries. The new product is obviously a serious restriction for academic research.

The current Bankscope database will remain available until the end of 2016, at which point the contract between Fitch and BvD will end, and BvD will need to take down the data. If this affects your research, I advise you to download data as soon as possible.

Email

Filing size of 10-K reports

No too long ago a student asked me if it was possible to find out what the size was for specific filings of companies with the SEC. Specifically it concerned the 10-K filings which are the annual reports with a comprehensive summary of the companies financial health. As I understand it, the file size of these filings was to be used in an exercise and served as a proxy for financial reporting readability.

To find the original filings it is of course possible to use the EDGAR search option to find the original full-text filings and parts (including the XBRL filings):

Using the Central Index Key (CIK), Ticker, or Company name it is easy to find a specific company. Using the Filing Type search box you can narrow the search down to specific filings, like 10-K. This is often possible going back to 1993/1994 (depends on the company).

A specific 10-K filing overview in Edgar would look as follows:

If you need to do this for several hundreds of companies and multiple years it would take some time to collect the file size data. If you have access to the Audit Analytics database it is possible to get the 10-K file size for a large number of companies at once through the WRDS platform. Audit Analytics has a part database that is called Accelerated Filer.


Through this part database it is possible to get filings data (from 2000 to now) using Ticker lists or CIK lists. In the example output below I have put the following variables:

NAME = Company Name
FORM FKEY = Form name (10-K and 10-Q)
FISCAL YE = Fiscal Year End
FILE SIZE = Filing size
MATCHFY_SUMFEES_FY = Fiscal Year

Example output using Excel to filter for just 10-K filings:

Important: If you look at the 2015 10-K form file size according to Audit Analytics for the company ADVANCE AUTO PARTS INC this is 17 MB. This corresponds (roughly) with the size of Complete submission file according to EDGAR: 17.748.770. The Complete submission file in Edgar includes not just text and html codes, but may also include pictures and any other file types (Excel files etc.).

Email

Total Q and Tobin’s Q

A new Compustat data source has become available to people who have access to Compustat databases through WRDS: Peters and Taylor Total Q. This new source provides data on firms’ “Total q” ratio and the replacement cost of firms’ intangible capital. Total q is an improved Tobin’s q proxy that includes intangible capital in the denominator, i.e., in the replacement cost of firms’ capital. Peters and Taylor estimate the replacement cost of firms’ intangible capital by accumulating past investments in R&D (Research and Development) and SG&A (Selling, General and Administrative Expenses). Background paper can be downloaded here.

Overview of the variables and names in the database:

datadate = Date
fyear = Fiscal Year
gvkey = GVKEY / Compustat unique company code
K_int = Firm’s intangible capital estimated replacement cost
K_int_Know = Firm’s knowledge capital replacement cost
K_int_offBS = Portion of K_int that doesn’t appear on firm’s balance sheet
K_int_Org = Firm’s organization capital replacement cost
q_tot = Total q


The database offers mainly data for companies which are included in the Compustat North America database. Most of these companies are American. The coverage is: 1950 – now.

Email