Using plain text files in Excel

If you use databases to collect data on companies or on anything else, the data files you can save as output are not always nice Excel or SPSS files with all the data in tact. One of the reasons can be the database itself (or the interface): it may change the data to fit the destination (program). Some of the data can, therefore, be reformatted. Excel also sometimes formats data that it presents on your screen.
It is usually a good idea to (if possible) also save the data as a plain old text file. Most of these files can be opened just fine using the Excel Text wizard. If the data is not separated with characters you may need to tell Excel where the columns start. That will only work for text files where the data columns are saved with a fixth width (= fixed number of characters between each column).

1) Most often, the data columns do have characters (or tabs) that separate them. The Text Wizard in Excel allows you to indicate what type of delimiter is needed. When you open a Text file with delimiters in Excel it may look as follows:

2) In the above example you can already see that the caret character ^ is probably what separates the columns of data. When you click the button Next the screen will show some standard Delimiters. The standard option that occurs a lot is the Tab-delimiter. In this case we can use the Other option to type in the caret character ^. Immediately when you do this the Data Preview will show what happens: if all the columns are nicely separated lines will be shown to indicate how Excel thinks the data should be separated into the columns:

3) As a final step you can tell Excel how to open the data in each column. If you leave everything on the Default option General Excel will try to figure out how everything should be presented. If you want to skip columns this is also possible. In my view it is always a good idea to select all the columns and open everything initially as Text. This way Excel will not use any formating whatsoever and all the data is presented exactly as you got it from the database. In any case, it is always a good idea to select the Text option for those columns that list company identifiers. If you do not, Excel may change these and make it impossible to use them later for additional searches or to combine the data later with another dataset. Example screen:

After selecting the import options for all the columns you can click the Finish button to import the text file. The result should look similar to this:

Not all text files have one (combination of) character(s) that separetes the text in columns. The example file I used from SDC Platinum also has a column where the data is separated using the tilde character: ~. See below:

It is also possible to use the Excel Text Wizard on a Text file after opening it:

  • The option is available in Excel using the tab at the top (in the menu) called Data (or: Gegevens). When you have done this the option that are presented in Excel change:
  • Next select the column for which you want to separate the data
  • Now select the option Text to Columns in the Excel menu. The text Wizrda should pop up:
  • In the example we see that the text has the delimiter character, called the Tilde: ~. If necessary, mark the option Delimited and click the button Next
  • A new screen appears where you can indicate what delimiter is needed. Use the Other option to indicate/type the Tilde character. Click the Next button to continue.
  • You can now also choose to got to the Next screen and determine how Excel will format the data. In this example I click the finish button. A warning may appear when there are already columns to the right. This warning says: “Do you want to replace the Contents of the destination cells?“. When I Click the yes button the result looks as follows:

NB: changing data with the Text Wizard in Excel can be dangerous. The original data will be changed and the result will not always be good. Some data may be lost. It is always a good idea to use a copy of the original file and use this before starting to rework the data.


%d bloggers like this: