Amazon EMR and AWS Glue now natively support audit-context logging for AWS Lake Formation credential vending APIs and for AWS Glue Data Catalog GetTable and GetTables API calls.
The capability is enabled by default, so organizations can add audit logging to existing data lake workflows while improving compliance, governance, and security. The audit context is recorded in AWS CloudTrail logs, giving additional traceability and monitoring of operations on the data lake.
Key Benefits & Compliance Advantages
• Regulatory Compliance & Governance: Audit logging is crucial for compliance with regulatory frameworks such as the Digital Markets Act (DMA) and other data protection regulations.
• Job-Level Visibility: The audit logs automatically capture critical metadata, including the platform type (EMR-EC2, EMR on EKS, EMR Serverless, or AWS Glue) along with identifiers such as Cluster ID, Step ID, Job Run ID, or Virtual Cluster ID.
• Improved Troubleshooting & Access Auditing: With detailed context linked to each Spark job, security and data engineering teams are able to correlate API calls with the exact job executions, diagnose fine-grained access control issues, and analyze historical data access patterns across a variety of compute environments.
• Broad Availability: This functionality is available in all AWS Regions where Amazon EMR, AWS Glue, and Lake Formation are supported, provided the deployment uses Amazon EMR release 7.12 or later or AWS Glue version 5.1 or later.
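To illustrate the kind of correlation described in the list above, here is a minimal Python sketch that groups CloudTrail-style records by the job that issued them. It is an illustration under stated assumptions, not the documented log format: the top-level keys (eventTime, eventSource, eventName, requestParameters) follow the general CloudTrail record layout, but the location of the audit context inside requestParameters, the JSON encoding of its value, and the key names platform, clusterId, stepId, jobRunId, and virtualClusterId are hypothetical. The sample records are made up for the example and should be replaced with real events from your own trail.

```python
import json
from collections import defaultdict

# Made-up sample records for illustration only. The nested audit-context
# keys and values are assumptions, not the documented CloudTrail format.
SAMPLE_RECORDS = [
    {
        "eventTime": "2025-01-01T10:00:00Z",
        "eventSource": "glue.amazonaws.com",
        "eventName": "GetTable",
        "requestParameters": {
            "name": "orders",
            "auditContext": {
                "additionalAuditContext": json.dumps(
                    {"platform": "EMR-EC2", "clusterId": "j-EXAMPLE1", "stepId": "s-EXAMPLE1"}
                )
            },
        },
    },
    {
        "eventTime": "2025-01-01T10:00:02Z",
        "eventSource": "lakeformation.amazonaws.com",
        "eventName": "GetTemporaryGlueTableCredentials",
        "requestParameters": {
            "auditContext": {
                "additionalAuditContext": json.dumps(
                    {"platform": "EMR-EC2", "clusterId": "j-EXAMPLE1", "stepId": "s-EXAMPLE1"}
                )
            },
        },
    },
    {
        "eventTime": "2025-01-01T11:30:00Z",
        "eventSource": "glue.amazonaws.com",
        "eventName": "GetTables",
        "requestParameters": {
            "auditContext": {
                "additionalAuditContext": json.dumps(
                    {"platform": "AWS Glue", "jobRunId": "jr_EXAMPLE"}
                )
            },
        },
    },
    {
        # A call without any audit context; it is skipped when grouping.
        "eventTime": "2025-01-01T12:00:00Z",
        "eventSource": "glue.amazonaws.com",
        "eventName": "GetDatabase",
        "requestParameters": {"name": "sales"},
    },
]


def extract_audit_context(record):
    """Return the audit context of one record as a dict, or {} if absent."""
    params = record.get("requestParameters") or {}
    raw = (params.get("auditContext") or {}).get("additionalAuditContext")
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # Keep unparseable values instead of discarding them.
        return {"raw": raw}
    return parsed if isinstance(parsed, dict) else {"raw": raw}


def job_key(context):
    """Build a (platform, identifier) key from the first identifier present."""
    for field in ("stepId", "jobRunId", "clusterId", "virtualClusterId"):
        if context.get(field):
            return (context.get("platform", "unknown"), context[field])
    return None


def group_calls_by_job(records):
    """Map each job key to the (eventTime, eventName) pairs it produced."""
    grouped = defaultdict(list)
    for record in records:
        key = job_key(extract_audit_context(record))
        if key is not None:
            grouped[key].append((record.get("eventTime"), record.get("eventName")))
    return dict(grouped)


if __name__ == "__main__":
    for key, calls in group_calls_by_job(SAMPLE_RECORDS).items():
        print(key, calls)
```

Grouping by the most specific identifier available (a step or job run before a cluster) is one possible design choice; a team auditing at cluster level could change the lookup order in job_key.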
Why This Matters for Enterprises
As data lakes grow larger and more complex, companies rely increasingly on distributed big-data tools such as Spark, which makes thorough auditing essential. By building audit-context logging into EMR and AWS Glue, AWS helps companies maintain high governance standards with less additional work.
Tracking which jobs access which data, on which compute type, and with what access patterns supports transparency, compliance, and security in regulated industries. The feature lets organizations make the most of cloud-native data lakes, scaling easily and staying flexible while still meeting audit and compliance requirements.


