Office for Mere Mortals helps people around the world get more from Word, Excel, PowerPoint and Outlook. Delivered once a week. free.
An example of web scraping from BestBuy.com into Excel.
We’ve already talked about getting prices from the Amazon.com web site into Excel with automatic updates. Now let’s look at how to do it for BestBuy.com prices.
Excel has had a method of web scraping for many versions you can find it at Data | Get External Data | from Web.
Place your cursor in the cell where you want the scrape results to go. For reasons that’ll become obvious, it’s best to insert the results into a separate worksheet for each product.
Go to Get External Data | From Web. Paste in the link to the Best Buy product page you want to check and press Go. You can browse in the Web Query window but it’s usually easier to find the page in a normal browser then paste the link into the Address field of the Web Query.
As you can see above, there’s only one main table you can select on the Best Buy Page (unlike Amazon product pages which have many smaller tables to select from). Click on the arrow top left to select most of the web page, then Import to continue.
As you can see, the web page is imported with text lines in each row. That’s why we suggested importing each page to a separate worksheet.
If you scroll down, you’ll see the price in a separate cell, approximately row 278.
Notice that Excel has imported the number 829.99 and automatically converted it to Currency formatting.
Get the price
How to extract that price from all that mess of data?
If you’re working on a small scale, with just a few products for your personal comparison, it’s probably enough to find the right cell (A278) and link to it from your main analysis sheet. Hopefully the cell reference won’t change over a short period of time, if it does change you’ll get an error which is easy to see and fix.
To automate finding the price cell the best we could do is using the MATCH() function to search for the cell ‘Free Shipping’ then subtract one to get the row above that. =MATCH(“Free Shipping”,A:A,0)-1 returns the row reference (278) not the value. Use INDEX to get the value from the cell reference e.g.
=INDEX(‘Lenovo from Best Buy’!A:A, MATCH(“Free Shipping”,’Lenovo from Best Buy’!A:A,0)-1 )
Office Watch has the latest news and tips about Microsoft Office. Independent since 1996. Delivered once a week.