As with previous versions of SharePoint, SharePoint 2010 will not index the contents of files larger than 16 MB. There are a couple of reasons for this, such as the network usage of pulling large files across and the time it takes to break them apart. While the file itself isn't indexed, its metadata is. So you'll be able to find the location of a file larger than 16 MB by searching for its name or its author, but you won't be able to find it by searching for words that exist in it.
With previous versions of SharePoint, the fix for this was to add a Registry key called "MaxDownloadSize" and set it to a number between 17 and 64. That tells the search engine to ignore the 16 MB limit and go ahead and index files up to that size, with 64 MB as the maximum. However, in SharePoint 2010 this has changed a bit. The indexer still doesn't download files larger than 16 MB, so that's the same. The way to fix it, though, is different now. Thanks to the invention of PowerShell, we can use that instead of getting our hands dirty in the Registry.
Here's the PowerShell code:
$s = Get-SPEnterpriseSearchServiceApplication
This is what it looks like in practice:
We can see here that the default value is still 16 MB, but that is easily changed to something like 25 MB. We also need to bounce (restart) the search service for this to take effect. Then, after your next full crawl, the contents of files larger than 16 MB, up to the new limit, will be indexed.
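For reference, the whole change can be sketched roughly as follows. This is a minimal sketch, not a definitive script. It assumes the farm has a single search service application, that the setting is exposed as a property named MaxDownloadSize (the same name the old Registry key used), and that the search Windows service is named OSearch14, which is my assumption for a SharePoint Server 2010 install. The value 25 is just the example size from above. Check each of these against your own farm before running it.

```powershell
# Load the SharePoint cmdlets if this is a plain PowerShell window
# (the SharePoint 2010 Management Shell already has them loaded).
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: exactly one search service application exists in the farm.
$s = Get-SPEnterpriseSearchServiceApplication

# Show the current limit in MB before changing anything.
$s.GetProperty("MaxDownloadSize")

# Raise the limit to 25 MB (example value) and save the change.
$s.SetProperty("MaxDownloadSize", 25)
$s.Update()

# Confirm the new value was stored.
$s.GetProperty("MaxDownloadSize")

# Bounce the search service so the new limit takes effect.
# Assumption: the service name is OSearch14 on this server.
Restart-Service OSearch14
```

After running this, kick off a full crawl so that the larger files are picked up.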
How do you know if you have documents larger than 16 MB? Unfortunately, that seems to have changed for the worse in SharePoint 2010. In SharePoint 2007, if the indexer came across a file larger than 16 MB, it would throw a warning in the crawl log. SharePoint 2010 doesn't do this. I haven't found a way to determine which files are skipped because they are larger than the current MaxDownloadSize setting. If anyone knows how to determine this, let me know.