A quick look at importing data into Excel. Excel has sooo many ways to import information into a worksheet. Go to Data | Get External Data to see the selection.
There are options to grab information from other databases, cloud services, web page tables etc in a variety of ways. The import can be ‘once only’ to copy into Excel or ‘Live’ where Excel regularly checks the source for updates.
Despite all those choices (and hype about them from Microsoft), still the most common way to get information into Excel is the humble and long standing Comma Delimited Values files (.csv). It’s simply a plain text file. Each line is a new row/record with a comma to separate each value/cell from the next. Here’s a short .csv that we made:
.CSV files are common because they are easy to make (any spreadsheet or database has an ‘Export to .csv’ option and programmer can easily code to make one), can be opened and viewed in any text editor and can’t contain viruses or other nasties. They are so common, anyone receiving a .csv should know how to handle it (no follow up questions about how to import the data). .CSV can be very large but compress nicely in ZIP or RAR.
They are called ‘comma delimited’ files but, as we’ll see, various characters can be used to separate values. Excel’s import wizard defaults to Tab as the separator. Any character can be used as the separator but Tab, Comma or Semicolon are the most common.
Don’t do it!
Excel can open .csv files directly. The .csv extension is associated with Excel so double-clicking on one will open it. But that imports the data using the default settings and no control over any data conversion or typing.
The better and safer option is to import a .csv into a new or existing worksheet. This will give you some options to control the data conversion process.
Comma Delimited Import
Go to Data | Get External Data | From text
Choose the .csv file to import and the Text Import Wizard will start.
Excel is pretty clever at figuring out the right settings for importing. One thing you should check is whether the first row of the .csv is real data or labels to describe the data (a header).
The data preview shows how Excel will separate the values using the current selection. As you can see above, the default Tab delimiter doesn’t help. Let’s try with the comma instead.
Now we’re in business. The data is shown aligned into columns which is what we want.
The next wizard screen looks at the conversion of the values into Excel cells. As you can see, the default is the ‘General’ format for each column.
We suggest explicitly choosing the data format for all possible columns. It can be done later but better to get it right from the start. The most important one to choose is any date column and select the appropriate date format. In this case that could be either DMY or MDY because the dates could be either!
The text columns (Product Code and Comments) are marked as Text format. It’s not really necessary for the Comments but is vital for the Product Code column, which has values that would otherwise be considered numbers.
The Sales column is values so leave it in the General format. Excel will import using the number format defined in Windows regional settings. To override that, or just to make sure, click on Advanced to define the decimal and thousands separators to use.
Finally, modern Excel will give you many options for displaying the imported data.
And even more choices under Properties, including the ability to ‘refresh’ the data from a changing data source.
Unless you’re really confident about the imported data, it’s best to import to a new worksheet. There you can check the import, change data formats etc before doing more work with the information. Here’s our sample data just after import into Excel.
There’s still some work to be done to make this data consistent and usable (called ‘normalization’ by us geeks). It’s a topic for another time but here’s the result after some normalization and formatting.
Now you can filter, chart, PivotTable or analyse the .csv data to your hearts content, knowing there much less risk of data or calculation mistakes.
Importing data can be tricky. If possible, have the import data formatted in a way that avoids conversion mistakes.
- Make the date clear and explicit, for example having the month as short text; Jan, Feb, Mar etc.
- Numbers with no comma separators for thousands (especially in comma delimited data)
- Watch for ‘stray’ commas in comma delimited data. Comments or notes fields often contain commas which Excel will assume marks the end of that data field.
Alas, you often don’t have control over the formatting of data you want to import. If that’s not possible, carefully check the results to make sure Excel has got it right.