Hi!

Here's your weekly set of articles and my observations about data platforms, primarily focused on the Azure cloud.

Important: Mastermind #3 session about Lakehouse on Azure Data Platform next Thursday (May 27 14:00 UTC). Sign up here or just reply "YES" to this email.

How do you approach Lakehouse implementation on Azure Data Platform? Do you use Synapse, Databricks, Azure Data Lake Gen2, or something else? Let's share our experiences and findings!


Summary

  • Open-source and commercial data quality packages and tools
  • Domain-driven data architecture at Disney - must watch!
  • Latest Power BI releases
  • Master Data Management in Azure
  • Python tips and frameworks
  • Data communities to follow

Data Quality Tooling

The last Data Platform mastermind session was about DataOps. I noted down a few data quality tools that you might want to check out too:


Domain-Driven Data Architecture at Disney

If you are about to embark on a data mesh journey, take a look at how Disney approached it for some inspiration. Really extensive material packed into 30 minutes.

https://www.youtube.com/watch?v=iFiidvkpwAI

I was planning to write a short summary, but the content was sooo good that I think it's better if you watch it yourself!

While I'm still on the Disney topic: one of the best books I've read this year has to be the memoir of Robert Iger, the former Disney CEO.


Microsoft Business Application Summit Recap

Although I'm not a day-to-day Power BI user, plenty of great stuff has been released lately. Read more about all the upgrades:

https://powerbi.microsoft.com/en-us/blog/microsoft-business-application-summit-recap/

And here's James Serra's condensed summary with top ten Power BI upgrades and explanations:

http://www.jamesserra.com/archive/2021/05/whats-new-with-power-bi/

But there is still one feature I have to wait for - embedding Power BI in Jupyter Notebooks!


Master Data Management in Azure

Data professionals on the Microsoft BI stack are familiar with SQL Server Master Data Services, a solution for master data management. The Azure data stack does not include a managed equivalent.

Rumor has it that Azure Purview will have it built in, but until then, you might take a look at these two COTS offerings featured by Microsoft:


Python Engineering Tips

https://github.com/SigmaQuan/Better-Python-59-Ways

I think it's a brilliant list of must-know Python patterns and how-tos:

  • Prefer exceptions to returning None
  • Use packages to organize modules and provide stable APIs
  • Avoid else blocks after for and while loops
  • and 56 other tips!
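
To make the first tip concrete, here's a minimal sketch of my own (it is not taken from the repository). The function names safe_ratio and describe_ratio are made up for illustration. The idea is that a None return value is easy to confuse with a legitimate falsy result such as 0.0, while an exception cannot be silently ignored:

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, raising instead of returning None."""
    try:
        return numerator / denominator
    except ZeroDivisionError as exc:
        # Raising makes the failure explicit. A None result could be
        # mistaken for a valid falsy value such as 0.0 in an `if` check.
        raise ValueError("denominator must not be zero") from exc


def describe_ratio(numerator: float, denominator: float) -> str:
    """Hypothetical caller showing how the exception is handled."""
    try:
        result = safe_ratio(numerator, denominator)
    except ValueError:
        return "invalid input"
    return f"ratio is {result:.2f}"
```

With this shape, safe_ratio(0, 5) returns a valid 0.0, and only a zero denominator takes the error path, so the caller never has to guess what a falsy result means.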

https://github.com/vinta/awesome-python

A curated list of awesome Python frameworks, libraries, software and resources

https://github.com/ml-tooling/best-of-ml-python

This curated list contains 840 awesome open-source projects for ML, analytics, and data.


Data communities to follow


Valdas Maksimavičius

IT Architect & Microsoft Data Platform MVP

https://www.dataplatformschool.com 

Vilnius
Lithuania
