How to Handle Tables from the Web in Word 2011 for Mac

By Geetesh Bajaj, James Gordon

Word 2011 for Mac can open Web pages that you saved from your Web browser. If a Web page contains an HTML (HyperText Markup Language) table, you can work with that table by using Word’s Table features. You might find it easier to copy just the table from the Web page and paste it into a working Word document.
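If you're curious what an HTML table looks like underneath, the following is a minimal Python sketch (not something Word does or requires) that pulls the rows and cells out of a saved Web page's markup. It assumes a simple table built from tr, th, and td tags; the names TableExtractor and extract_rows are made up for this illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td> and <th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while we are inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string (hypothetical helper)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


page = (
    "<table><tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Apples</td><td>3</td></tr></table>"
)
for row in extract_rows(page):
    print(row)
```

Each tr element becomes one row and each th or td becomes one cell, which is the same row-and-column structure Word shows when it opens the page.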

At some point, you might come across a PDF (Portable Document Format) file that contains a valuable table you want to extract. If the table in the PDF is text-based rather than a scanned image, you can use the Mac OS X Preview application to take a stab at getting the table information. Follow these steps:

  1. Open the PDF file in the Mac OS X Preview application.

  2. In Preview, choose Edit→Select All.

  3. In Preview, choose Edit→Copy.

  4. Switch to Microsoft Word by clicking Word’s Dock icon or by using whichever method you usually use to switch or launch applications.

  5. Make sure you have a new or existing document open.

  6. In Word, choose Edit→Paste.

    You may need to manually delete extraneous information. If no text was pasted, the PDF probably doesn’t contain any text or is locked, and you can’t use this method to grab the data, so you have to stop here. If text was pasted, continue with the next step.

  7. In Word, select the pasted text that needs to be converted to a table.

  8. Convert the selected text to a table by choosing Table→Convert→Convert Text to Table.

    Word makes a table out of the data.
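
To get a feel for what the conversion in step 8 does, here is a minimal Python sketch of the general idea, not Word's actual implementation. It assumes the pasted text has one line per row and a tab between columns; the function name text_to_rows and the sample data are hypothetical.

```python
def text_to_rows(text, separator="\t"):
    """Split delimited text into rows, each a list of cell strings."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, such as stray paragraph breaks
        rows.append([cell.strip() for cell in line.split(separator)])
    # Pad short rows with empty cells so every row has the same width.
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]


# Made-up sample: the last line is missing its third column.
sample = "Item\tQty\tPrice\nApples\t3\t1.20\nPears\t5"
for row in text_to_rows(sample):
    print(row)
```

If the text copied from a PDF uses spaces or commas rather than tabs between columns, the same idea applies with a different separator, which is why the extraneous text mentioned in step 6 is worth cleaning up before you convert.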

PDFs, Web pages, and other documents you find online can contain tables that were saved as images. If that’s the case, you need Optical Character Recognition (OCR) software to convert the pictures of text into actual text. OCR software isn’t included with Office. Cheap scanners have been known to ship with high-quality OCR software that’s worth even more than the scanner. ReadIris (www.readiris.com) is excellent for OCR.