
Desktop Searching - Part 3

We look at the most popular of the tools for MS Office users – Google Desktop Search

In part one of this special Office Watch series we looked at the possibilities of the new range of desktop searching tools: how they work, what to look for, security implications and pitfalls.

Then in part two we checked out the major contenders in the computer search stakes – with one notable exception.

Now in part three we look at the most popular of the tools for MS Office users – Google Desktop Search.

GOOGLE DESKTOP SEARCH

Google has brought fast desktop searching to the masses and, despite some limitations, they’ve done a really good job.

In short they really have brought what we love about web searching with Google to searching your own computer with Google.

It’s hard to over-estimate the difference that Google Desktop has made around Office Watch World Headquarters (i.e. Peter’s Thinkpad in cafes around the world). Little bits of information on your computer are now accessible in a way we could only dream of before. For example, finding the name of a hotel in Venice from 5 years ago used to involve lots of brain-ache and finally a call or email to a friend – now typing ‘Venice hotel’ into Google Desktop shows the relevant documents and emails in a flash.

Finding information buried in Word, Excel, PowerPoint and Outlook is now so much easier than it was a few months ago. It’s truly invaluable for any Microsoft Office user and light-years ahead of anything from Redmond.

For us, Google Desktop gets used like that every day and it is a godsend. The messages from Office Watch readers indicate that there are plenty of other fans of the product as well. With that popularity and usefulness, we think Google Desktop deserves its own special coverage in Office Watch.

GETTING STARTED

Google Desktop Search is available free from Google and is only about 450KB – tiny by modern standards. All Office programs and Internet Explorer must be shut down during the installation. It only works with Windows XP and 2000. Outlook should be restarted immediately after Google Desktop has installed because Outlook content can only be indexed when Outlook is running.

NOTE: Google Desktop Search is a beta product, a fact often overlooked by users and reviewers. As a result many of the features are not fully documented or developed. You’ll see parts of this issue where we’ve had to make educated guesses about GDS based on our tests. While we’ve found it stable and invaluable, keep in mind that it’s not a final product, despite the popularity and publicity.

You can specify folders to exclude from the indexing. Google Desktop seems to index only Drive C: – not networked or mapped drives – and there’s no way to specify which drives to index, only what not to index. An offline copy of a user profile (e.g. My Documents synched with a Windows Server) IS included in the index.

Some people get confused between ‘normal’ Google for the web and the desktop search. ‘Desktop’ here isn’t the Windows desktop; Google uses the word to mean your local computer.

The program installs and does its initial indexing quietly in the background, pausing until after any user activity before resuming the task. You can check its indexing progress at any time by right-clicking the icon in the taskbar and choosing Status.

We found the indexing unobtrusive but have to report that some readers found Google Desktop to be a drain on their systems. Our tests didn’t show that: despite the presence of over 9GB of text files on the test computer, Google Desktop indexed the lot while we worked and there was no noticeable effect.

SIDEBAR YOUR HONOR: “YOUR MILEAGE MAY VARY”

All through this series we’ve said that one person’s experience with indexing and its effect on their computer can be very different from someone else’s. While we’ve had bad experiences with Copernic and X1, we’ve acknowledged that other people have not had the same troubles.

In America they say “Your mileage may vary” after a disclaimer on car ads.

Our experience, and that of many Office Watch readers, is that Google Desktop is the least intrusive of the indexing products we’ve mentioned. However you might have a different experience – it will depend on various factors, many of which are a mystery at this stage.

WHAT IS INDEXED?

Google Desktop can search …

  • Outlook email

  • Outlook Express email

  • AOL Instant Messaging

  • Word documents

  • Excel worksheets

  • PowerPoint presentations

  • Text and other files like RTF.

  • Web history / cache

  • Optionally including secure web pages (HTTPS)

Microsoft Outlook has to be running when indexing is occurring in order to include Outlook email. Only email folders like Inbox and Sent Items seem to be included. Sub-folders below the Inbox ARE included in the index.

You cannot add more extensions to the list to be indexed. That’s a pity because it would let you index any text regardless of filename. This is a useful feature in Copernic Desktop that many WOW readers like. Many Copernic users report that adding .one to the extension list will index OneNote files.

Only filenames are indexed for JPG, GIF, WMA, MP3, AVI and PDF files, and possibly some others; none of the data inside these files (meta-data in images or text in PDFs) is indexed. The omission of PDFs especially is a major mark against Google Desktop. Indexing XML/RSS data would have been nice.

The Microsoft-centric view of Google Desktop’s accepted file types is understandable but unfortunate. There are plenty of other file types beyond those used by Windows and MS Office, and to stay ahead of the pack Google Desktop must have broader coverage.

You can see the index status by right-clicking the Google Desktop icon and choosing Status. On the same menu you can choose Pause Indexing, which will suspend the process for 15 minutes – a handy option if you find the indexer isn’t as subtle as you’d like.

All Google Desktop operations and settings are done via your web browser even if your computer is offline. If you are offline and try to do a search, you might get a warning that your computer is offline – just click ‘Connect’ and your browser will open with the Google Desktop search window.

Sadly the help pages are on their web site so you can only check them when on the net.

SEARCHING

Aside from simple searches you can use the more complex operations that are a normal part of Google web searches.

There’s a green, yellow and red icon in the system tray (bottom right of taskbar) that lets you access the search window (actually a web page opening in Internet Explorer), view the indexing status, change preferences or pause the indexing for 15 minutes.

All the usual Google commands apply plus the special ‘Filetype:’ option detailed below.

There doesn’t seem to be any way to limit the search by date; however, the results can be sorted by date or relevance (see the right of the results pane, below the blue bar showing the number of results found).

Click on the options shown to see the results by type – All, Emails, Files, Chats and Web History.

Snapshots of web pages appear next to those results. Abstracts of the text appear under each result on the Google Desktop results page – this is a widely appreciated feature that lets you identify the document you want much faster than a traditional preview pane.

You can search your computer separately or as part of a normal Google web search (though you can switch this off if you wish). The results are quick and accurate; however, including Google Desktop in a web search can slow the process down slightly.

Sadly the famous Google Toolbar has not yet been updated to allow Desktop Searches from there.

There’s a real opportunity for someone to develop an ‘advanced search’ page for Google Desktop. Something that exposes the command line options in a menu format like the Advanced Search page does for regular Google. Or even clone the new search options on the MSN Search beta.

FILETYPE

You can specify the type of data to search using the filetype: command in your search. For example, “filetype:word mineral” will show any Word .DOC files with the word mineral in them. What is searched depends on what Google Desktop is capable of – for example, filetype:doc will look inside the contents of .doc files, but filetype:pdf will only check the file names of Acrobat files.

Known filetypes are:

  • filetype:word and filetype:doc

  • filetype:excel and filetype:xls

  • filetype:powerpoint and filetype:ppt

  • filetype:text and filetype:txt

  • filetype:email

  • filetype:chat

  • filetype:web and filetype:htm and filetype:html

  • filetype:jpg

  • filetype:gif

  • filetype:pdf

  • filetype:wma and filetype:wmv

  • filetype:mp3

  • filetype:avi

Notice that the Google people are smart enough to have filetype: alternatives built-in so that whether you use ‘Word’ or ‘DOC’ you’ll get the answers you expect.

But the current version of Google Desktop requires the “filetype:” parameter to come first.

“filetype:doc fred dagg” will work as expected

“fred dagg filetype:doc” will not
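The ordering rule above is easy to forget when typing queries by hand. As a minimal sketch in Python (the build_gds_query helper is our own hypothetical name, not anything supplied by Google Desktop), a search string can be assembled so that the filetype: filter always comes first:

```python
def build_gds_query(terms, filetype=None):
    """Join search terms into one query string.

    Hypothetical helper for illustration only. If a filetype is given,
    its filter is placed at the front, because the beta is reported to
    ignore a filetype: filter that follows the search terms.
    """
    words = [t.strip() for t in terms if t.strip()]
    if filetype:
        words.insert(0, "filetype:" + filetype.strip().lower())
    return " ".join(words)


if __name__ == "__main__":
    print(build_gds_query(["fred", "dagg"], filetype="doc"))
```

The resulting string is what you would paste into the Google Desktop search box; the sketch does not talk to Google Desktop itself.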

CACHING

Strictly speaking Google Desktop (like Google on the web) doesn’t index documents, web pages etc. It indexes the copy held in the program’s cache. Usually this is irrelevant, but it is worth keeping in mind if you can’t find something that you know is on the computer.

We’ve had cases where Google Desktop won’t return results from a document that is definitely on the computer. We’re not sure why the doc isn’t included in the cache and there seems no way to force it to be included (opening and re-saving the document might work).

There doesn’t seem to be a way to empty or reset the cache. Uninstalling then reinstalling Google Desktop may do it.

Importantly, Google Desktop keeps past cached versions of documents. This is handy but I wouldn’t rely on it. If you need proper version control use the features provided in Word and Excel.

If you search for a term you may see a link next to a result saying ‘2 cached’ or more. Click on that link to see previous versions of that document as text-only renderings.

SECURITY CONCERNS

As we mentioned in Part 1 of this series – the speed of Google Desktop makes your computer more accessible to others if you leave the machine unattended and unlocked. This was leapt upon by the mainstream media as a security breach, but it’s not Google’s fault.

There are other legitimate security concerns which Google is aware of and we can hope they’ll address these before the next release.

They already have to some extent. The series of digits in the url of any search is a protection against misuse when including your desktop and web results on the same page. If you change or remove those digits it will yield an error page instead of results. It also explains why you can’t tinker with a Google Desktop url to refine the results manually.

The fact that the local web site is always the same and the index is in a fixed location is a worry. It means a baddie could check if you have Google Desktop because the local url is always 127.0.0.1:4664. Similarly any security exploit that gives access to files with known names or locations can be used to grab data from the Google Desktop cache.
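To illustrate how little effort such a check takes, here is a minimal Python sketch that tests whether anything is listening on that local address. The function name is_port_open is hypothetical, and an open port only suggests, rather than proves, that Google Desktop is the program answering there.

```python
import socket


def is_port_open(host="127.0.0.1", port=4664, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds.

    Illustrative only: this detects that *something* is listening,
    not which program it is.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on 127.0.0.1:4664")
    else:
        print("Nothing is listening on 127.0.0.1:4664")
```

The same few lines could be run by any program on the machine, which is why a fixed, well-known port makes the product easy to detect.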

It would be better if Google Desktop allowed configuration of the url, port number and file location / names.

These concerns are currently theoretical and not a reason to avoid Google Desktop unless you work with very sensitive material.

The lack of external controls or policies makes Google Desktop a concern for network administrators and I can sympathize with the IT managers who have banned it on their network (while using it themselves on the sly).

There are plenty of extra features and tweaks we’d like to see in Google Desktop Search, but the beta version is quite wonderful. What it lacks is made up for with a simple interface that anyone who has used the web will grasp effortlessly.

 
