A modern data architecture, often referred to as an Enterprise Data Lake or Enterprise Data Hub, is an agile data infrastructure that delivers a coherent view of data assets, with broad functionality to ingest, consolidate and process raw, unmodeled or lightly modeled data. Whatever architecture the enterprise adopts should guarantee a set of capabilities that let it operationalize its data assets and extract value from them. These capabilities include:
- Data Catalog: Know what data is available and how it can add value to the organization.
- Data Lineage: Know where, when and how data is moved and consumed, not just within the data warehouse or data lake but across all downstream business functions and the wider enterprise.
- Data Quality: Enforce policies and processes around data acquisition, transmission, consumption and disposition using automation, while reporting on key metrics through real-time analysis.
- Metadata: Provide end-to-end visibility, auditability and traceability across all kinds of metadata while maximizing analytics performance.
- Ingestion: Accommodate a wide variety of Big Data in different formats, structures and attributes from many sources, while still enabling efficient processing, fast queries and precise results.
- Analytics: Synthesize and master the available data and provide actionable insights as and when required.
- Data Security: Establish policy-based security and access controls for end-to-end data audit, authentication and protection.
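
To make a few of these capabilities concrete, the following is a minimal Python sketch of how a catalog entry might carry lineage and access policy, alongside a simple quality metric. It is an illustration under stated assumptions, not a reference implementation: the names `Dataset`, `Catalog`, `null_rate`, the dataset names and the role names are all hypothetical, and a real deployment would typically rely on a dedicated catalog or governance tool rather than in-memory objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalog entry: what the data is, where it came from, who may read it."""
    name: str
    description: str
    upstream: list[str] = field(default_factory=list)      # lineage: direct source datasets
    allowed_roles: set[str] = field(default_factory=set)   # policy-based access control


class Catalog:
    """Hypothetical in-memory catalog keyed by dataset name."""

    def __init__(self) -> None:
        self._entries: dict[str, Dataset] = {}

    def register(self, dataset: Dataset) -> None:
        self._entries[dataset.name] = dataset

    def lineage(self, name: str) -> list[str]:
        """Return every upstream dataset name, nearest first (breadth-first walk)."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen

    def can_read(self, name: str, role: str) -> bool:
        """Check the access policy attached to a dataset."""
        return role in self._entries[name].allowed_roles


def null_rate(rows: list[dict], column: str) -> float:
    """Quality metric: fraction of rows where `column` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


# Usage with made-up datasets and roles.
catalog = Catalog()
catalog.register(Dataset("raw_orders", "Orders as ingested", allowed_roles={"engineer"}))
catalog.register(Dataset("clean_orders", "Validated orders",
                         upstream=["raw_orders"], allowed_roles={"engineer", "analyst"}))
catalog.register(Dataset("daily_revenue", "Revenue per day",
                         upstream=["clean_orders"], allowed_roles={"analyst"}))

sample_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": None},
               {"id": 3}, {"id": 4, "amount": 5}]

print(catalog.lineage("daily_revenue"))
print(catalog.can_read("raw_orders", "analyst"))
print(null_rate(sample_rows, "amount"))
```

Keeping lineage and access roles on the same record as the description is one possible design choice: it lets a single lookup answer what a dataset is, where it came from and who may use it. Production systems often separate these concerns across different services.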