Current Restriction in Microsoft Purview Unity Catalog Scanning
As of now, Microsoft Purview supports scoping scans only at the catalog level when working with Azure Databricks Unity Catalog. This means:
- You cannot directly filter scans by schema or table within Unity Catalog.
- The scan setup UI does not offer schema-level or table-level filtering.
- Custom scan rule sets do not support table filters for Unity Catalog scans.
Workarounds and Recommendations
While schema-level filtering is not natively supported, here are some practical workarounds:
1. Split Catalogs Strategically
- Consider splitting your Unity Catalog into smaller, purpose-specific catalogs. This allows you to scope scans more narrowly and reduce scan volume.
2. Use Managed Access Controls
- Apply access controls within Unity Catalog to restrict visibility and access to only relevant schemas and tables. This doesn’t limit scanning but helps manage exposure.
3. Automate Filtering via Scripts
- Use automation or scripting to post-process scan results and isolate the tables of interest. This approximates filtering after the fact; the scan itself still covers the whole catalog.
4. Leverage Lineage Tracking
- Utilize lineage tracking to monitor how data flows across tables and schemas. This helps in governance and identifying critical data paths.
5. Use Hive Metastore for Schema-Level Scans
- If schema-level granularity is essential, consider using the Hive Metastore connector instead of the Unity Catalog connector. The Hive Metastore connector supports schema-level filtering; the Unity Catalog connector does not.
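
For workaround 3, here is a minimal sketch of what the post-processing could look like. It assumes you have already exported the scanned assets into a list of records; the field names (`catalog`, `schema`, `table`), the `INCLUDE_PATTERNS` list, and both helper functions are hypothetical and chosen only for illustration. It does not call any Purview API, so adapt the input to whatever your export actually produces.

```python
from fnmatch import fnmatchcase

# Hypothetical include patterns in catalog.schema.table form.
# Shell-style wildcards are allowed; note that "*" also matches dots.
INCLUDE_PATTERNS = ["sales.finance.*", "sales.reporting.orders_*"]


def table_path(asset):
    """Build a lowercase 'catalog.schema.table' path from an asset record.

    The record layout is an assumption, not a Purview format.
    """
    return f"{asset['catalog']}.{asset['schema']}.{asset['table']}".lower()


def filter_assets(assets, patterns):
    """Keep only the assets whose path matches at least one pattern."""
    lowered = [p.lower() for p in patterns]
    return [
        asset
        for asset in assets
        if any(fnmatchcase(table_path(asset), p) for p in lowered)
    ]


if __name__ == "__main__":
    # Made-up sample records standing in for exported scan results.
    scanned = [
        {"catalog": "sales", "schema": "finance", "table": "invoices"},
        {"catalog": "sales", "schema": "reporting", "table": "orders_2024"},
        {"catalog": "sales", "schema": "reporting", "table": "customers"},
        {"catalog": "sales", "schema": "staging", "table": "tmp_load"},
    ]
    for asset in filter_assets(scanned, INCLUDE_PATTERNS):
        print(table_path(asset))
```

Keeping the patterns in one list means the set of schemas and tables you care about lives in a single place, which is about as close to a schema-level filter as you can get while the scan itself stays catalog-wide.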