Getting data from Internet into Excel
Office for Mere Mortals / Tuesday, November 01, 2011

Office Watch ebooks - available for you today

 Ever seen some data on the Internet and want to copy it for yourself?  From small tables to large data chunks, there’s several ways to do it.

Simple copying

Almost any table or grid on a web page can be copied into Excel with the web page cells becoming cells in Excel.

For example this, admittedly self-promoting, table:

Title

Normal Price

Subscribers price

Discount

Privacy and Security in Microsoft Office

$16.95

$9.95

$7

Office 2010: the real startup guide

$16.95

$9.95

$7

Access Archon: Working with Excel

$16.95

$9.95

$7

Access Archon: Working with Word

$16.95

$9.95

$7

Effective Outlook Calendars

$7.95

$4.95

$3

Organizing Outlook Email

$7.95

$4.95

$3

Clever Outlook Contacts

$7.95

$4.95

$3

Formatting Magic with Word

$7.95

$4.95

$3

Eye-Catching Signs with Word

$9.95

$6.95

$3

Creative Certificates with Word – premium

$19.95

$14.95

$5

Christmas Cheer with Office – premium

$14.95

$10.95

$4

Valentine’s Day with Word

$9.95

$6.95

$3

Office Backup Handbook

$14.95

$9.95

$5

Simply select the entire table with your mouse pointer. It doesn’t matter if you include some text before and after the table; you can remove the excess later.

Then paste the web table into a blank Excel worksheet.  Here’s the result with the sentence before the table included.

Suppressed Image:

This method only copies displayed data, not any formulas.  The last column ‘discount’ could be calculated but no formulas are copied, only the displayed values. 

As you can see, all the web table data is now in Excel cells. The hyperlinks from the table have been preserved as well as the formatting. But it needs a little work to make it useable.

The top row can be deleted to put the table headings in row A.

Column B is too narrow (that’s why the #### marks appear) widen until the numbers show up.

Widen the other columns so the headings don’t wrap in such an unsightly way.

Tip: to automatically adjust a column to fit the contents, put the mouse pointer on the divider between column headings until you get the twin-arrow icon then double-click.  A similar trick works for rows.

Suppressed Image:

Choose Data | Filter to make the column headings into data labels which you can pull-down on to filter or sort upon.

Notice that Excel has been smart about the currency values.  The $ sign has been stripped out of the web table text and the cell automatically changed to a currency value with $ prefix in the formatting. 

Suppressed Image:

It’s best to check that Excel has done this, especially with other, non dollar, currency signs. 

Not only are the numbers imported as the displayed values only (as opposed to higher precision underlying values in an Excel worksheet) but they are imported as text – it’s up to Excel to figure out what the cell values are and change them accordingly.  It can cope with currency and numbers as we’ve seen.  Dates are usually converted too, but that depends on the formatting of the text values.

 

You can use Excel formatting commands to change the look of the data from whatever was on the web page.  The formatting you get depends, in part, on which browser you copy from.  No surprise that the most consistent results come from using Microsoft’s Internet Explorer.

Doctor Who and the curse of the evil worksheets

There are plenty of publically available worksheets available.  Google Docs online spreadsheet application encourages people to make worksheets public.  They make is easy to do in a way that Microsoft does not.

 

Some of the data is serious and important but other worksheets are more quirky.  For example this worksheet of all the villains and monsters from the long running UK series, Doctor Who courtesy of The Guardian.

 

The entire list is available from Google docs here.

Suppressed Image:

You can select cells or the entire list and paste into Excel, as described above.  However that’s clumsy for a large worksheet and doesn’t include any cell formulas.

The easier method is to download the whole worksheet. In Google Docs you can do that from the File | Download | Excel menu,

Suppressed Image:

Then you can open the XLS file in Excel and format to your heart’s content. 

Suppressed Image:

Downloading a worksheet should also include any formulas.

Data.gov

Not all public data is so frivolous.  Data.gov is a web site full of data sets from the US government.

Many are available as Excel XLS downloads, like these details of oil and gas production.

Suppressed Image:

This is very raw data – over 37,000 rows of information and cryptic headings like “conden_prod_BBL”.

There are explanatory documents linked from the download page.

The Guardian Datablog

The Guardian newspaper in the UK has an interesting blog about data sources and presenting them in interesting ways.

For example a visualization of the cause of death in the UK which includes near the bottom, a link to a worksheet with all the raw data.  Readers are welcome to download the data for themselves.

The World Bank

Another data source is The World Bank where you can download Excel worksheets compressed into ZIP files for faster downloading.

 

Found any interesting data sources on the web?

Let us know via our Feedback page – please tell us the web link and the data collections you think are interesting or fun.

 

Going the other way?

We have an Office-Watch.com article on copying data from Excel into a web page.

 

 

To save bandwidth, we have suppressed any images in the PDA version of this article. Click here to see the uncompromising version of this article (in a desktop browser).




Office Watch - mobile edition - Home
Office Watch
Office for Mere Mortals
Email Essentials
Buying Office
Office 2010 - the next MS Office Winks
Office News Wire

Mobile edition of Office Watch, your independent source of MS Office news, tips and information - Copyright Office Watch 1996-2014

Use this link to see the full size browser version of Office Watch