Data Governance and Metadata framework for Hadoop
Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to meet their compliance requirements within Hadoop effectively and efficiently, and allows integration with the whole enterprise data ecosystem.
- Import or define business-oriented taxonomy annotations for data
- Define, annotate, and automate capture of relationships between data sets and underlying elements including source, target, and derivation processes
- Export metadata to third-party systems
- Capture security access information for every application, process, and interaction with data
- Capture the operational information for execution, steps, and activities
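As a rough illustration of the annotation and relationship capture listed above, the following minimal sketch models data sets with business tags and processes that link sources to targets, then walks the derivation chain backwards. It is a toy in-memory model, not the Atlas API: `MetadataStore`, `add_dataset`, `add_process`, `upstream`, and the sample data set names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    name: str
    tags: set = field(default_factory=set)  # business-oriented annotations


@dataclass
class Process:
    name: str
    inputs: list   # names of source data sets
    outputs: list  # names of target data sets


class MetadataStore:
    """Hypothetical in-memory store used only for illustration."""

    def __init__(self):
        self.datasets = {}
        self.processes = []

    def add_dataset(self, name, tags=()):
        self.datasets[name] = DataSet(name, set(tags))

    def add_process(self, name, inputs, outputs):
        self.processes.append(Process(name, list(inputs), list(outputs)))

    def upstream(self, name):
        """Return every data set that `name` was derived from, directly or indirectly."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for proc in self.processes:
                if current in proc.outputs:
                    for src in proc.inputs:
                        if src not in seen:  # also guards against cycles
                            seen.add(src)
                            stack.append(src)
        return seen


store = MetadataStore()
store.add_dataset("raw_customers", tags={"PII"})
store.add_dataset("clean_customers", tags={"PII"})
store.add_dataset("sales_report")
store.add_process("cleanse", ["raw_customers"], ["clean_customers"])
store.add_process("aggregate", ["clean_customers"], ["sales_report"])

lineage = store.upstream("sales_report")
```

Here `lineage` contains both `raw_customers` and `clean_customers`, which is the kind of source-to-target chain a lineage view would display.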
Search & Lineage (Browse)
- Pre-defined navigation paths to explore the data classification and audit information
- Text-based search features locate relevant data and audit events across the data lake quickly and accurately
- Browsable visualization of data set lineage allows users to drill down into operational, security, and provenance-related information
Security & Policy Engine
- Rationalize compliance policy at runtime based on data classification schemes, attributes and roles.
- Advanced definition of policies (prohibitions) that prevent data derivation based on classification (i.e., re-identification)
- Column- and row-level masking based on cell values and attributes.
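To make the masking idea concrete, here is a minimal sketch of classification-driven masking, assuming a simple policy table that maps a classification tag to the roles allowed to see clear values. The names `POLICY`, `COLUMN_TAGS`, `mask_row`, `visible_rows`, the `restricted` flag, and the role names are hypothetical and do not reflect how Atlas or any particular policy engine implements this.

```python
# Hypothetical policy table: classification tag -> roles allowed to see clear values.
POLICY = {
    "PII": {"compliance_officer"},
}

# Hypothetical column metadata: column name -> classification tags.
COLUMN_TAGS = {
    "name": set(),
    "email": {"PII"},
}


def mask_row(row, role, column_tags=COLUMN_TAGS, policy=POLICY):
    """Column-level masking: hide values of columns whose tags the role may not see."""
    masked = {}
    for column, value in row.items():
        tags = column_tags.get(column, set())
        # Visible only if the role is allowed for every tag on the column.
        # A tag with no policy entry is denied by default; untagged columns pass.
        allowed = all(role in policy.get(tag, set()) for tag in tags)
        masked[column] = value if allowed else "****"
    return masked


def visible_rows(rows, role):
    """Row-level rule (assumed): rows whose `restricted` cell is true are
    dropped for everyone except compliance officers."""
    return [
        mask_row(row, role)
        for row in rows
        if role == "compliance_officer" or not row.get("restricted", False)
    ]


rows = [
    {"name": "Ann", "email": "ann@example.com", "restricted": False},
    {"name": "Bob", "email": "bob@example.com", "restricted": True},
]

analyst_view = visible_rows(rows, "analyst")
officer_view = visible_rows(rows, "compliance_officer")
```

With this sample data, the analyst sees only Ann's row with the `email` value masked, while the compliance officer sees both rows unmasked. The default-deny choice for tags without a policy entry is one possible design; a real deployment might prefer explicit allow or deny rules.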
Apache Atlas is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.