Hedge Funds & Datastream

A short time ago I was asked about databases that have information on Hedge Funds (HF). HF are investment vehicles that use pooled funds to use investment strategies that generate high returns. Some of these funds are very aggressive and have a bad reputation. HF are not regulated in the same way as Mutual Funds which also use pooled funds to invest and generate returns. Mutual Funds (MF) are much more regulated and can be sold to the general Public. More information is available in text books and Investopedia & Wikipedia.

I know of only two databases that have information specifically on individual HF type organizations: Datastream and the CRSP Mutual Funds database. In both cases it possible to search on individual HF companies by name or code(s). In the case of Datastream HF and MF can be found in the Data Category Unit Trusts. Through the Criteria Search option (at the top left corner of the Navigator screen) you can find them:

The search options are limited but the most important options are name searches, and Market searches. In addition to this you can also do a quick search on the keyword “Hedge Fund” and this will also offer a filter option (on the left side of the screen) for Unit Trusts.

Another interesting option are the Constituent lists of Hedge Funds. These lists provide an easy way to get some country lists.

If you have a list of Lipper codes (ID codes) you can also upload a list of these in Datastream using the Excel option “Create list from range“. Lipper codes are created by Lipper, Inc. This company (nowadays) is a subsidiary of Thomson Reuters and provides provides mutual and hedge fund data.

LPID – Lipper Identification Code (= Lipper PermID)
Definition: Unique fund identifier code that Lipper assigns to each Share Class. The identifier code is form of 8 numeric’s which Datastream prefix with ‘LP’. Example LP40004324

When you have created your own selection of HF companies (using Market lists or manually) you can do a Static search to find out more about the type of HF and Focus:

Examples of available Datastream Data Types are:

ATYPE – Asset Type
The Lipper Asset Type item describes the primary asset of the fund, for example; Bond, Commodity, Equity, Mixed Asset, Money Market, Real Estate etc. As a general rule a fund will be placed according to its primary investment objective as stated by the promoter and given by the fund documents.

GFOCUS – Geographical Focus
The Lipper Geographical Focus item describes the primary countries or regions where the fund invests. As a general rule a fund must hold at least 50% of its assets in a country/region to be assigned the relevant attribute.

LGC – Lipper Global Classification
The Lipper Global Classification is a granular description of the fund investment strategy, built out from the Asset Type and Geographical Focus items to create homogeneous groups of funds with comparable investment objectives.

N.B.: More Background information on Lipper and the Classification Methodology are available on Wikipedia and the Lipper website.


WorldScope coverage update 1st Quarter 2015

WorldScope company records now cover annual reports data for 77479 companies. This includes 46267 active and 31212 inactive companies. This update: 484 companies were added. Because of these additions the number of active companies in the database has gone up 99 from 46168 in December. This is the third update the past year  that the total number of active companies went up. WorldScope company records are also available through Datastream and LexisNexis.

Today I have updated the WorldScope country coverage file and it now includes the latest update as it was posted in the first Thomson Reuters Infostream quarterly publication of 2015.

Major updated Countries (new records):
Australia (28)
China (36)
Hong Kong (27)
Japan (14)
South Korea (40)
Taiwan (49)
Thailand (15)
United Kingdom (18)
United States (116)


Restoring unique codes

In the past I blogged about the fact that if you work with codes like SEDOL (or any other types), to get information on listed companies, it is important to keep them in tact. Excel, for instance, has a habit of removing zeros when they are the first digits of a code.
Some databases (like SDC Platinum) will not give you a good search result if codes are incomplete when you search using a list of these abbreviated codes. Some databases are more forgiving when you search on codes and the zeros in front are gone.
The second, more important reason, however, to keep the codes in tact (including zero digits in front) is, that when you combine datasets, you will probably use codes to match them. Example:

For whatever reason, when you use Excel, you may have inadvertently have codes in a dataset that are no longer in tact. In the earlier blog post I solved this problem using a nested IF() function for Sedol codes. This example could be adapted for longer codes.
Recently, a student of the Vrije Universiteit, pointed out a more elegant solution: using the REPT() and LEN() functions. The advantage of this solution is, that it can very easily be adapted for codes of a different length. The solution function is:


Explanation: The function will repeat/add the character Zero (= 0) to the characters in a Cell and the number of zeroes depends on the X minus the Length (= len() ) of the number of characters already in the Cell. Added to this will be the characters of the original Cell.

Example for Sedol codes (always 7 digits long):

If the abbreviated Sedol code is: 68G78
The Function would be (Sedol code is 7 digits long): = REPT(“0”;7-LEN(Cell))&Cell
The result would be: 0068G78

Below you see an example for abbreviated 9-digit CUSIP codes:



New Tobin’s Q formulas

In the past I reported on a formula that could be used to download the Tobin’s Q ratio for individual companies. Recently, I discovered that two new formulas have been created by Thomson Financial for Datastream which should provide a better solution to get this ratio. These formulas can be found using the search option to find formulas and available functions in Datastream: the Expression Builder or Expression Picker. The button is available below the orange Data Type button:

Using the button you can search for many formulas:

The Expression builder button allows you to search for Version 2 (created in 2015) using prefrabicated Datatype code 168E which represents the formula:


Running the full expression like you would the previous formula will not work. You will get the error:

The only way to use the formula is by using the mnemonic as the Data Type: 168E

The second new (simpler) version of the Tobin’s Q formula is also available there: X(DWEV)/X(DWTA)*1.000

This version can be used as you would regular (smaller) formulas at the box for Data Types. Example:

An explanation of the different part codes can be found in the Datastream search menu for Data Types. You can find an overview of Data type definitions for the items in the formulas also here.