In the world of SharePoint, the 5000 item per view limit is a well-known challenge, the stuff of legend. To recap, while a SharePoint list or library can have up to 30 million items in it, SharePoint refuses to show you more than 5000 of them at a time. That’s an API level control to protect the backend, so it won’t do this in a view in a web page, an API call through PowerShell or CLI, nothing. While there are several ways to navigate around this limit, such as using CAML queries, these methods often fall short when dealing with really large datasets. This blog post will explore a unique solution to this problem using PowerShell, specifically focusing on the use of Search to retrieve data.
The Challenge
Recently, I was working with a client, a law firm, who had a whopping 7.8 million items in a SharePoint document library. Of course not a great Information Architecture. We were helping them fix that. Among other horrors, over the years, attorneys had copied the contents of CDs and DVDs to various places in SharePoint, creating a massive and complex data structure. The challenge was to find all these so we could migrate them out or delete them. However, due to the sheer volume of data, there was no way to slice the data using the normal tools to get back fewer than 5000 results. This is where PowerShell swoops in and saves the day.
The PowerShell Solution
The solution came in two parts. We were looking for the DVDs by looking for the autorun.inf file in the root. I discovered I couldn’t using something like Get-PnPListItem to find all of the autorun.inf files because I couldn’t find a way with CAML, or anything else to pare the result set down below 5000. There were just too many files. However, I did discover that I could find them in the Search Center, which gave me the first idea. Get them from Search in PowerShell. I used the Submit-PnPSearchQuery cmdlet to send a search query to SharePoint and it retrieves all the results. It gets them in batches of 500 (the maximum for a single search request), with the option of making multiple requests if necessary to retrieve all results. This worked pretty well, but was tedious because I could only get 500 at a time and there were thousands. I had to modify the search, run it again, and append those results to the results from the previous searches. That was too much work, and led to the second part. I wrote a PowerShell function called Submit-PnPSearchQueryAll. This function uses the Submit-PnPSearchQuery cmdlet to send a search query to SharePoint and retrieves all results, paging through them and running multiple queries as needed. If the -ShowProgress switch is provided, the function will display the total number of results and a progress bar.
Before we look at the function itself, let’s see how it works. My usage looked like this:
$AllResults = Submit-PnPSearchQueryAll -query "autorun.inf"
That stored every file named autorun.inf in the variable $AllResults. There are 5046 of them:
$AllResults.Count
Since that’s a collection of objects I can treat them like any old object:
$AllResults[1000]
and
$AllResults | Where-Object {$_.ParentLink -like "*AttorneyFiles/Johnson, Michael*" }
or
$AllResults | Where-Object {$_.ParentLink -like "*AttorneyFiles/Johnson, Michael*" } | select Path,ParentLink
and
$AllResults | Where-Object {$_.ParentLink -like "*AttorneyFiles/Johnson, Michael*" } | Export-Csv .\mj.csv
See all the fun you can have? Since the object we’re getting back is a PnPResultTable object, it doesn’t have all the same properties as a PnPListItem. When I wrote the function I had to decide which ones I needed. If you use this, you might need something different.
How It Works
The function begins by initializing variables for the starting row and page size, set to 500. It then enters a loop where it performs the search query with Submit-PnPSearchQuery and retrieves the results. If the –ShowProgress switch is provided, it will display the total number of results on the first run and a progress bar for each subsequent run.
For each result, the function outputs a custom object with the desired properties. It then increments the $startRow by the $pageSize and continues the loop while the $startRow is less than the Total Rows.
Before I got this working I tried a couple of other approaches, but this one worked the best.
Grab the PSM1 file with Submit-PnPSearchQueryAll here.
Conclusion
This PowerShell function proved to be an effective solution to the SharePoint 5000 item limit, allowing us to retrieve all items from a massive SharePoint document library. It demonstrates the power and flexibility of PowerShell and SharePoint's Search functionality when dealing with large datasets. Whether you're dealing with millions of items or just want a more efficient way to retrieve data from SharePoint, consider giving this function a try.
tk
ShortURL: https://www.toddklindt.com/Posh5000ItemLimit