AZ-900 Certification Notes

Chapter 9.2 - Big Data

Big Data - Moving Definition

  • Data from Millions of Devices
  • The definition of how big "Big Data" is changes as the industry becomes able to process more and more data

Business Value

Big Data = Better service, better products, more profits

Data Lake Analytics

  • Large Amounts of Data
    • A data lake is a very large body of data
  • Parallel Processing
    • Two or more processes or computers processing the same data at the same time. Data Lake Analytics includes parallel processing
  • Ready to Go
    • Servers, processes and any other needed services are ready to go from the start. Jump straight into the data analytics
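
Sketch: parallel processing in miniature

A minimal Python sketch of the idea behind parallel processing: split the data into chunks, process the chunks at the same time, then combine the partial results. This is not Data Lake Analytics code. The thread pool stands in for a cluster of machines, and the readings, the threshold and all function names are hypothetical, made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_readings_above(chunk, threshold):
    """Count the readings in one chunk that exceed the threshold."""
    return sum(1 for value in chunk if value > threshold)


def split_into_chunks(data, n_chunks):
    """Split data into at most n_chunks contiguous pieces of similar size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_count(data, threshold, workers=4):
    """Count readings above the threshold, one chunk per worker."""
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker handles one chunk; the partial counts are then summed.
        partial_counts = pool.map(
            lambda chunk: count_readings_above(chunk, threshold), chunks
        )
        return sum(partial_counts)


# Hypothetical device readings (assumption: made-up values from 0 to 99).
readings = [(i * 37) % 100 for i in range(1000)]
total = parallel_count(readings, threshold=90)
print(total)
```

The answer is the same as counting in a single loop. The difference is that each chunk can be handled by a separate worker, which is the property a managed analytics service scales up across many machines.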

HDInsight

  • Similar to Azure Data Lake Analytics
  • Built on open-source frameworks, which are free and community supported
  • Includes Apache Hadoop, Spark, and Kafka

Azure Databricks

  • Based on Apache Spark, a distributed cluster-computing framework
  • Run and process a dataset on many computers simultaneously
  • Databricks provides all the computing power
  • Integrates with Azure storage services

Azure Synapse Analytics

  • Azure's data warehouse offering
  • Formerly Azure SQL Data Warehouse
  • Used for reporting and data analysis
  • Analysis is limited only by your scope
  • Use Synapse SQL to query and manipulate the data

Outcomes

  • Speed
    • Processing large amounts of data quickly and efficiently provides real value
  • Cost Reduction
    • Save large amounts of money on storage and processing by using a Big Data solution in the cloud
  • Better Decision Making
    • Immediate data processing and analysis in-memory means you can make better decisions, and make them faster
  • New Products and Services
    • Understand what customers want and provide them with much better products and services