Data integration is the art and science of combining data from disparate data sources for particular business purposes.
These purposes include:
> support for data warehousing operations such as extract, transform and load (ETL) tasks
> support for real-time and batch application interfaces
> support for performing operational tasks across the enterprise
Integral to Data Integration is the concept of Data Quality. As data moves from one system to another, its quality can be improved many-fold by cleansing it, enriching it with information from external systems (such as demographics or ZIP-code-specific data), removing redundant records (deduplication), and labeling the data appropriately (generating metadata). Moving data between systems can be challenging due to the volatility and volume of data in the source system. To address this, Data Integration can take advantage of change data capture (CDC) technologies to access only the rows that changed among billions, thereby speeding up DI operations.
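Two of the data-quality steps above, cleansing and deduplication, can be sketched in a few lines of code. This is a minimal, self-contained illustration; the record layout and key fields (`name`, `zip`) are hypothetical and not drawn from any particular system.

```python
def cleanse(record):
    """Normalize whitespace and casing so equivalent values compare equal."""
    return {
        "name": record["name"].strip().title(),
        "zip": record["zip"].strip(),
    }

def deduplicate(records, key_fields=("name", "zip")):
    """Keep the first occurrence of each unique key; drop later duplicates."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical source rows, with inconsistent casing and whitespace.
raw = [
    {"name": "  alice smith ", "zip": "10001"},
    {"name": "Alice Smith", "zip": "10001 "},   # duplicate after cleansing
    {"name": "Bob Jones", "zip": "94105"},
]

clean = [cleanse(r) for r in raw]
deduped = deduplicate(clean)
# deduped now holds two records: Alice Smith and Bob Jones
```

Note that cleansing runs before deduplication: until the values are normalized, the two "Alice Smith" rows would not compare as equal, and the duplicate would slip through.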
Kimball Warehouse Toolkit Classics is a collection of books published by the Kimball Group on data warehousing design and development. We apply the Kimball techniques in our design projects and continuously stay abreast of new technology and developments in the data warehousing market as the practice matures.
The Data Warehouse Institute is a well-respected, platform-agnostic thought leader in the business intelligence and data warehouse markets. In addition to providing excellent training on the principles of data warehousing and business intelligence, it is a valuable access point for industry news, as well as weekly webinars and podcasts on hot BI/DW topics.