In this post, we will cover a new feature in preview for Microsoft Fabric: Lakehouse Schemas. If your Lakehouse is drowning in a sea of unorganized data, this might just be the lifeline you’ve been looking for.


The Data Chaos Conundrum

We’ve all been there. You’re knee-deep in a project, desperately searching for that one crucial dataset. It’s like trying to find your car keys in a messy apartment; frustrating and time-consuming. This is where Lakehouse Schemas come in, ready to declutter your data environment and give everything a proper home.


Lakehouse Schemas: Your Data’s New Best Friend

Think of Lakehouse Schemas as the Marie Kondo of the data world. They help you group your data in a way that just makes sense. It’s like having a super-organized filing cabinet for your data.

Why You’ll Love It

  • Logical Grouping: Arrange your data by department, project, or whatever makes the most sense for you and your team.
  • Clarity: No more data haystack. Everything has its place.

The Magic of Organized Data

Let’s break down how these schemas can transform your data game:

Navigate Like a Pro

Gone are the days of endless scrolling through tables. Prior to this feature, if your bronze layer had many sources and thousands of tables, finding the exact table in your Lakehouse was a difficult task. With schemas, finding data just became a lot easier.

Pro Tip: In bronze, organize sources into schemas. In silver, organize customer data, sales data, and inventory data into separate schemas. Your team will thank you when they can zoom right to what they need.

Schema Shortcuts: Instant Access to Your Data Lake

Here’s where it gets even better. Schema Shortcuts allow you to create a direct link to a folder in your data lake. Say you have a folder structure in your data lake like silver/sales filled with hundreds of Delta tables. Creating a schema shortcut to silver/sales automatically generates a sales schema, then discovers and displays all the tables within that folder structure instantly.

Why It Matters: No need to manually register each table. The schema shortcut does the heavy lifting, bringing all your data into the Lakehouse schema with minimal effort.

Note: Table names with special characters may route to the Unidentified folder within your lakehouse, but they will still show up as expected in your SQL Analytics Endpoint.

Quick Tip: Use schema shortcuts to mirror your data lake’s organization in your Lakehouse. This keeps everything consistent and easy to navigate.

Manage Data Like a Boss

Ever tried to update security on multiple tables at once? With schemas, it’s a walk in the park.

Imagine This: Need to tweak security settings? Do it once at the schema level, and you’re done. No more table-by-table marathon.
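To make that concrete, here is a minimal sketch of what a schema-level grant could look like. Treat this as an assumption-heavy illustration: it assumes you connect to the Lakehouse’s SQL analytics endpoint with an Entra ID token, and the endpoint address, Lakehouse, schema, and role names are all placeholders, not values from this post.

# Hedged sketch: apply security once at the schema level via the SQL analytics endpoint.
# The endpoint, Lakehouse, schema, and role names below are placeholders.
$token = (Get-AzAccessToken -ResourceUrl "https://database.windows.net/").Token

Invoke-Sqlcmd -ServerInstance "<your-endpoint>.datawarehouse.fabric.microsoft.com" `
    -Database "<YourLakehouse>" -AccessToken $token `
    -Query "GRANT SELECT ON SCHEMA::sales TO [DataAnalysts];"

One grant at the schema level covers every table in that schema, which is exactly the table-by-table marathon you are skipping.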

Teamwork Makes the Dream Work

When everyone knows where everything is, collaboration becomes a whole lot smoother.

Real Talk: Clear organization means fewer “Where’s that data?” messages and more actual work getting done.

Data Lifecycle Management Made Easy

Keep your data fresh and relevant without the headache.

Smart Move: Create an Archive or Historical schema for old data and a “Current” schema for the hot stuff. It’s like spring cleaning for your data!

Take Control with Your Own Delta Tables

Managing your own Delta tables adds overhead, but it gives you greater flexibility and control over your data compared to relying on fully managed tables within Fabric.

The Benefits:

  • Customization: Tailor your tables to fit your specific needs without the constraints of Fabric managed tables.
  • Performance Optimization: Optimize storage and query performance by configuring settings that suit your data patterns. Be aware that you must maintain your own maintenance schedules for optimizations such as vacuum and v-order when managing your own tables.
  • Data Governance: Maintain full control over data versioning and access permissions.

Pro Tip: Use Delta tables in conjunction with schemas and schema shortcuts to create a robust and efficient data environment that you control from end to end.


Getting Started: Your Step-by-Step Guide

Ready to bring order to the chaos? Here’s how to get rolling with Lakehouse Schemas:

Create Your New Lakehouse

At the time of writing this, you cannot enable custom schemas on existing Lakehouses. You must create a new Lakehouse and check the Lakehouse schemas checkbox. Having to redo your Lakehouse can be a bit of an undertaking if all of your Delta tables are not well-organized, but getting your data tidied up will pay dividends in the long run.

Plan Your Attack

Sketch out how you want to organize things. By department? Project? Data type? You decide what works best for you and your team.

Create Your Schemas

Log into Microsoft Fabric, head to your Lakehouse, and start creating schemas. For folders in your data lake, create schema shortcuts to automatically populate schemas with your existing tables.

Example: Create a schema shortcut to silver/sales, and watch as your Lakehouse schema fills up with all your sales tables, no manual import needed.

Play Data Tetris

If you choose not to use schema shortcuts, you can move any tables into their new homes; it’s as easy as drag and drop. If you are using schema shortcuts, any reshuffling of schemas happens at the data lake location of your Delta tables, handled in your data pipelines.

Manage Your Own Delta Tables

Consider creating and managing your own Delta tables for enhanced control. Store them in your data lake and link them via schema shortcuts.

Stay Flexible

As your needs change, don’t be afraid to shake things up. Add new schemas or schema shortcuts, rename old ones, or merge as needed.


Pro Tips for Schema Success

  • Name Game: Keep your schema names clear and consistent. Work with the business on naming as well, to help prevent confusion about what is what.
  • Leverage Schema Shortcuts: Link directly to your data lake folders to auto-populate schemas.
  • Document It: Document what goes where. Future you will be grateful, and so will your team.
  • Team Effort: Get everyone’s input on the structure. It’s their data home too.
  • Own Your Data: Manage your own Delta tables for maximum flexibility.
  • Stay Fresh: Regularly review and update your schema setup and configuration.

The Big Picture

Organizing your data isn’t just about tidiness—it’s about setting yourself up for success.

  • Room to Grow: A well-planned schema system scales with your data.
  • Time is Money: Less time searching means more time for actual analysis.
  • Take Control: Managing your own Delta tables adds some overhead, but it also gives you more control and the flexibility to optimize your data environment.
  • Instant Access: Schema shortcuts bridge your data lake and Lakehouse seamlessly.
  • Roll with the Punches: Easily adapt to new business needs without a data meltdown.

Wrapping It Up

Microsoft Fabric’s Lakehouse Schemas and Schema Shortcuts are like a superhero cape for your Lakehouse environment. They bring order to chaos, boost team productivity, and make data management a breeze.

Remember:

  • Schemas create a clear roadmap for your data.
  • Schema shortcuts automatically populate schemas from your data lake folders.
  • Managing your own Delta tables gives you more control and efficiency.
  • Your team will work smarter, not harder.
  • Managing and updating data becomes way less of a headache.

So why not give the new Lakehouse Schemas feature a shot? Turn your data jungle into a well-organized garden and watch your productivity grow!


Happy data organizing!

This article will help guide you to go green with your Azure VMs and save some green!

When building a cloud solution, many different items need to be considered. One of these items, which will help guide you in the right direction, is following the Azure Well-Architected Framework. This is a framework all cloud professionals should be informed about. If your solution is already deployed, take the Assessment Review and see where you can improve.

The Azure Well-Architected Framework Pillars:

  • Cost Optimization
  • Operational Excellence
  • Performance Efficiency
  • Reliability
  • Security

On top of that, Microsoft is also taking considerable steps toward sustainability and aligning with the Green Software Principles from the Green Software Foundation.

Today, we will focus on sustainability and cost optimization for Azure VMs.

Go Green with Automation to Stop and Start Your VMs

Depending on your workload, you may need to start and stop your VMs on a schedule. A use case can be development servers that only need to run during business hours.

There is already an Auto-shutdown feature in the VM settings, under the Operations section. However, there is no Auto-start feature as of writing this article.


Within Auto-shutdown, you simply configure a few properties: enable the schedule, set the shutdown time and time zone, and optionally turn on a notification before shutdown.

There are many options to achieve the Auto-start functionality, but two of the most common and quickest to implement are Automation Tasks and Logic Apps.

With Automation Tasks, you configure Runbooks within your Automation Account and can browse the gallery for a plug-and-play solution. At the time of this article, the Azure Automation Run As Account is scheduled to retire on September 30, 2023, so be sure to use a Managed Identity if you opt for this solution.
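The runbook body itself is just PowerShell. Here is a minimal sketch, assuming the Automation Account has a system-assigned Managed Identity with permission to start the VM; the resource group and VM names are placeholders:

# Minimal runbook sketch: sign in with the Automation Account's Managed Identity and start a VM.
# Replace the placeholder resource group and VM name with your own.
Connect-AzAccount -Identity

Start-AzVM -ResourceGroupName "<YourResourceGroup>" -Name "<YourVmName>" -NoWait

Attach the runbook to an Automation schedule for your auto-start time, and you have the missing half of Auto-shutdown.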

If you go this route, one of the most popular templates looks like the graph below:

Go Green using Azure Logic Apps to start and stop your VMs.

I would opt for a Logic App. They are very user-friendly and have an abundance of connectors and functionality. You can configure them directly within the individual VM settings, but I would not advise doing so.

If you were to configure them per VM, this is how you would do it:

  • Go to the Automation section of the VM
  • Add a task
  • Select a template
  • Authenticate
  • Configure the task

Creating them in the individual VM is very easy but a pain to manage, since it will create a Logic App per VM. When you start seeing the same logic repeated, that should be a clue that you are not abiding by the DRY (Don’t Repeat Yourself) principle.

We will use parameterization to make the Logic App easier to build and manage. We will skip the steps needed to create the Logic App resource. Here is a link to guide you through those initial steps and other information around Logic Apps: Overview – Azure Logic Apps | Microsoft Learn.

For the parameterization, we will create the following parameters:

  • EmailID (string): semicolon-separated list of email addresses
  • Frequency (string): set this to “Day”
  • Interval (integer): set this to 1
  • Notify (bool): set this to true
  • StartTime (string): set this to the first day you want the schedule to start
  • TimeZone (string): set this to the desired time zone
  • VmList (array): a JSON array of objects, one per VM, with the following schema (see the example after this list):
    {"resourceGroup": "string", "resourceVM": "string", "subscriptionId": "string"}

The first step of developing the Logic App is to create the trigger. We will create a “Recurrence” trigger since we are building this on a schedule. You could also make this event-based to fit your needs, triggering from an HTTP request, an email, a Service Bus message, etc.

Next, we will loop over the VmList array. For each iteration, we will execute the following logic (a PowerShell sketch of the same flow follows the list):

  • Start the VM: passing subscriptionId, resourceGroup, and resourceVM parameters
  • Send an email on failure or success if the “Notify” parameter is true
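For clarity, here is the same control flow sketched in PowerShell. The Logic App implements this with the Azure VM connector and an email action rather than cmdlets, so treat this purely as an illustration of the logic:

# Sketch of the per-VM loop the Logic App runs; not the Logic App itself.
foreach ($vm in $VmList) {
    Set-AzContext -Subscription $vm.subscriptionId | Out-Null

    $result = Start-AzVM -ResourceGroupName $vm.resourceGroup -Name $vm.resourceVM

    if ($Notify) {
        # In the Logic App this is a "Send an email" action to the EmailID list
        Write-Output "Start of $($vm.resourceVM) finished with status: $($result.Status)"
    }
}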

For Each Loop:

Notifications:

There you have it: a Logic App set up on a schedule to start VMs at the frequency you decide! If any changes need to be made, you can simply tweak your parameters. You get one resource, with a friendlier interface and an abundance of connectors, controls, and functionality, to start and stop your VMs. As stated, there are many ways to accomplish this task, but as a way to drive down cost and compute, this is a great solution.

If you need help saving money on your Azure subscriptions, contact us!

Happy coding!


Replay in the Cloud like a Rockstar

Want to save money, validate performance, and make sure you don’t have errors while migrating to Azure SQL Database, Azure SQL Managed Instance, or SQL Server on Amazon RDS? In this video, you will learn how to use the Database Experimentation Assistant to replay and compare your on-premises or cloud SQL Server workloads on demand. First, you will learn how to capture a workload on-premises or in the cloud. Next, you will master replaying your workload on demand as needed. Finally, you will analyze and compare a baseline workload against another workload that includes your changes.

Replay in the Cloud Video

Workload Replay is the Secret Weapon for a Successful Migration to the Cloud

Slidedeck

Recommended Reads

If you liked this video, check out the following other resources.

I recently spoke at a conference and was asked what the easiest way is to import databases to Azure SQL Database. Therefore, I wanted to share how I do this with DBATools.io. You can use the same code to just export if you need a local copy of an Azure SQL database as well.

Import-Module dbatools -Force
<# Variables #>
$BackupPath = "C:\Demo\AzureSQL\Bacpac" #folder location for backups
$SourceInstance = "sql2019\sql2016"
$DBName = "AdventureWorksLT2012"
$AzureDestInstance = "procuresqlsc.database.windows.net"
$DBNameDest = $DBName

<# Bacpac options for export and import #>
$option = New-DbaDacOption -Type Bacpac -Action Export
$option.CommandTimeout = 0
$option2 = New-DbaDacOption -Type Bacpac -Action Publish
$option2.CommandTimeout = 0

<# The following assumes the Azure SQL Database exists and is empty;
   Azure will create the database by default if it doesn't exist #>
$bacpac = Export-DbaDacPackage -Type Bacpac -DacOption $option -Path $BackupPath `
    -SqlInstance $SourceInstance -Database $DBName

Publish-DbaDacPackage -Type Bacpac -SqlInstance $AzureDestInstance `
    -Database $DBNameDest -Path $bacpac.Path -DacOption $option2 -SqlCredential username

What Is my Performance Tier?

Great question. As of 3/3/2020, if the database does not exist in Azure SQL Database, it will be created. When it’s created, the database uses the default performance tier, which is General Purpose (Gen5) with 2 vCores.

The default performance tier for a new Azure SQL Database costs $371.87 per month.

How to create cheaper databases

Great question: you can import databases to Azure SQL Database more cheaply using PowerShell. It is as simple as using the Azure PowerShell module. In the example below, I use my existing Azure SQL Database server and end up creating a new database with the “S0” tier.

<# Install the Azure PowerShell module (requires local admin)
Install-Module -Name Az -AllowClobber -Scope AllUsers
#>
Import-Module Az -Force

$RGName = "<Your Existing Resource Group>"
$ServerName = "<Your Azure SQL Database Server>"
$DBName = "<Name of the Database to Create>"
$Tenant = "<Add Tenant ID>"
$Subscription = "<Add your Subscription ID>"

Connect-AzAccount -Tenant $Tenant -SubscriptionId $Subscription
$server = Get-AzSqlServer -ResourceGroupName $RGName -ServerName $ServerName

$db = New-AzSqlDatabase -ResourceGroupName $RGName `
    -ServerName $server.ServerName -DatabaseName $DBName `
    -RequestedServiceObjectiveName "S0"

Your Homework Assignment

You’ve got all the nuts and bolts to generate a script that can migrate all the databases on an instance, importing them to Azure SQL Database in one loop.
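If you want a head start on that loop, here is a minimal sketch that reuses the variables and bacpac options from the scripts above. It simply walks every user database on the source instance, so adjust the filtering and the service objective for your environment:

# Hedged homework sketch: export and publish every user database on the source instance.
$databases = Get-DbaDatabase -SqlInstance $SourceInstance -ExcludeSystem

foreach ($db in $databases) {
    $bacpac = Export-DbaDacPackage -Type Bacpac -DacOption $option -Path $BackupPath `
        -SqlInstance $SourceInstance -Database $db.Name

    Publish-DbaDacPackage -Type Bacpac -SqlInstance $AzureDestInstance `
        -Database $db.Name -Path $bacpac.Path -DacOption $option2 -SqlCredential username
}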

Need Help Moving to Azure?

Let us help you! Schedule a free 30-minute chat to see if we can point you in the right direction. Take advantage of our free resources too.

If you liked this blog post, sign up for our free newsletter!


 

In my opinion, automation will enhance DBAs. It will allow DBAs to provide more value to the business. Good DBAs will be fine. If you are a DBA with ten years of experience who has just repeated year one ten times, that might be a different story for a different blog post. Therefore, automation enhances good DBAs’ careers just like the sync button enhances turntablists; the sync button doesn’t replace good turntablists’ careers.

Follow along to see if you agree or disagree. I look forward to seeing your feedback in the comments section on this topic.

Some of my good friends know I have a passion for DJing on the ones and twos. A subset of those friends know I took a job in high school just so I could buy two Technics 1200s and records. In my free time, I started taking an online DJ Masterclass, and it quickly jumped out to me that the sync button for turntablists and club DJs (see, just like DBAs, there are several different types of DJs) isn’t much different from automation for good DBAs. The sync button is a way to automate the tedious task of making sure two different songs are played at the same speed so they can be mixed together. In the past few years, I started to hear buzz that automation will replace DBAs, so I wanted to write a blog post providing more details on a recent tweet of mine.

 

What is this sync Button You Speak Of? How does it relate to DBAing?

Going back to my high school days for a second: one of the core fundamentals of being a turntablist is using beat matching to keep the dance floor going as you transition from one song to another. It took me months to learn this fundamental skill. I had to learn how to measure BPM (beats per minute) to see if the songs could even mix together. I had to learn beats (think of pages) and bars (think of extents) and learn how to mix the songs on the same beat on the right bars. I had to learn how to adjust the speed of each of the two records to make the BPMs match on my own. When mixing, I had to make sure the two records stayed on beat during the mix. Syncing records isn’t the hard part of DJing; it’s just a tedious part.

Today, most DJ software packages have a sync button so this process can be automated. Now we can hit the sync button and the BPMs will match almost all of the time. What would you do if you never learned the basics and relied on the sync button, and the BPM for your song is wrong, or worse, your venue doesn’t have a sync button? I think this would be similar to having corruption occur and not knowing the basics behind DBCC CHECKDB, backups, and restores. You won’t be keeping people on the dance floor for long, or have a successful DBA career.

  If you don’t know the basics, you will rely on the software and have no other options if and when the software doesn’t automatically work for you.   

 

I Love Automation, I have loved it for years!

I love automation because it allows me to automate tedious tasks and focus on the ones I love and that provide value. For now, I will talk about automation in general and will focus a bit later on some new features in SQL Server that help automate things.

I have been using automation as a tool in my toolbox for almost my whole DBA career. I have given talks on Getting Started with PowerShell for DBAs; my first PASS Member Session was on utilizing Policy-Based Management to automate basic health checks. You can even use 3rd-party tools or custom scripts to help with some automation. Just make sure they work for you and your environment. I have seen a nice automated SQL Server installer built a very long time ago that used JavaScript before JavaScript was cool. I bet it actually started with JScript (insert painful JScript vs. JavaScript memories from my web development days).

The language used doesn’t matter, what matters is what the process does.  

At the same time, I have actually seen a software service provider’s application that, in my opinion, didn’t monitor transaction log backups correctly. Yikes. I mention this not to scare you, but to remind you that you need to know the basics to know if an automation process works for your needs. If you don’t know what the automated process is supposed to do, how do you know it is doing it correctly? Using the wrong automation tool, or using the tools incorrectly, could be very hurtful to your career. This looks a lot like the sync button for DJing to me. Automation tools and scripts also evolve and change, so this isn’t a ‘set it and forget it’ thing either, in my humble opinion.

Good, experienced DBAs have been leveraging automation for years.

It has been a tool in their toolbag for quite a while, not a new thing. DBAs also know how to leverage it as a great tool instead of seeing it as a new tool that replaces their jobs.  BIML is a great automation tool for Business Intelligence (BI) Developers.  Does BIML automate good BI Developers out of their jobs? Nope.  Does it help them automate so they can focus more time toward high-value tasks and get more things done? Yep, it sure does.

If automation is new to you as a DBA, or maybe you are starting out as a DBA, I suggest checking out some well-proven open source tools in a non-production environment. As you test out automation, several solutions include options to script out the tasks without executing them. This can be very helpful for learning (just like I used the script feature in SSMS to learn a lot about the T-SQL behind the GUI). If you are looking for a place to start, I highly recommend dbatools.io. It’s amazing to look at all the things one could do with this single tool.
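As one hedged example of that “script out but don’t execute” idea: many dbatools commands that change state support the standard -WhatIf switch, so you can preview what would happen before anything runs. The instance name and memory value below are placeholders:

# Illustration only: preview a change without applying it.
Set-DbaMaxMemory -SqlInstance "<YourTestInstance>" -Max 8192 -WhatIf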

Can Self-Tuning Databases Completely Automate DBA Tuning?

Performance tuning is one of my favorite parts of being a DBA; therefore, there is no way I would skip talking about some of the great enhancements that allow SQL Server to tune your own environment in Azure SQL Database and the box product. It’s purely amazing to see some of the new cutting-edge tuning features that, in my opinion, are game changers that will help Microsoft further separate themselves from all other database management systems on the market.  

Don’t hold your breath while you try to read about all of the great performance improvements provided just since SQL Server 2014. There are cardinality estimator changes, auto tuning for adding and removing indexes, adaptive plan changes, Query Store, resumable index rebuilds, heck, good old Database Tuning Advisor gets better, and, wait for it… Automatic Tuning. Don’t take my word for it that these features will be amazing. Go check out Lance Tidwell’s PASS Summit talk in November on some of these new features. Actually, take a second to read his abstract. Notice the following: “We will also look at some of the pitfalls and potential issues that can arise from these new features.” I find it interesting that it looks like Automatic Tuning and other features might not safely automate all your tuning needs.

I don’t foresee Self-Tuning Databases being able to self-tune everything.

A friend of mine, Joey D’Antoni, recently wrote a great blog post on Self-Tuning Databases. Let’s go by this statement in his blog post: “I’m sure this would take years and many versions to come into effect.” Let’s fast forward and assume this is fully in effect, including multiple features beyond execution plan changes. I will throw in an example: artificial intelligence that could find several commonly known T-SQL anti-patterns and rework the code for all of them so the optimizer-friendly version executes.

Let’s take a look at why I do not foresee automated tuning fully replacing Performance Tuning DBAs.

The way people implement business processes with technology will prevent automated tuning from fixing everything in your environment. It could be hardware, database design, bad coding, anti-patterns in code, misuse of ORMs, ORMs struggling with your data model, etc. My point is, I don’t see performance tuning getting completely automated to the point where performance tuning skills will be obsolete. I am not alone; another good friend of mine, Grant Fritchey, writes, “Finally, despite the fact of adaptive plans and automated regressions, you’re still writing code that NO amount of automation can fix.”

 

Will the Cloud Automate DBAs Out of their Jobs?

The cloud will not automate DBAs out of their jobs. First, let me start with the fact that I love the cloud. I write about Azure, and I give presentations on Azure SQL Database. Azure SQL Database has grown a lot over the years and has almost all the same features as the box product. Microsoft has also added great features to differentiate itself from its competitors, like Query Performance Insight, Automated Tuning to add and remove indexes, Geo-Replication, and more, which help make Azure SQL Database my favorite platform-as-a-service offering. Soon, Azure Managed Instances of SQL Server will make admin tasks even easier for DBAs, as they will be able to apply their skills directly, with fewer changes from the box product.

Even though it is a fantastic time to leverage cloud services, not all companies are going to put all of their critical data there. Many are still using SQL Server 2005. Some still work with mainframes. Using the same logic, shouldn’t those DBAs have been automated away years ago?

We have helped with migrations to multiple cloud providers and even helped clients spin up their own cloud environments that they offer as a service, while providing them Remote DBA Services as well.

I know what you’re thinking. Why do I need DBA help if I am in the cloud?

If someone is using a managed service for their databases in a platform-as-a-service (PaaS) model like Amazon RDS or Azure SQL Database, why would they need a DBA to manage the databases? Isn’t the managed service managing them for you? Well, it’s simple: your needs might not be completely aligned with the service level agreements that are offered. Take a second and read the Amazon RDS SLA and then the Azure SQL Database SLA. Keep in mind, I am not telling you not to leverage these services; I just want you to know what is covered and what gaps DBAs can still fill. If you also take a look at the Amazon RDS Best Practices, you will see several things that can keep DBAs busy.

Did you notice any mentions of your data in the SLAs?

Is anything mentioned in writing about what will happen when stack dumps occur or data corruption happens? Did you see anything written on how you get notified if corruption happens, what actions will be taken on your behalf that might cause data loss, or whether you should make business decisions and take action yourself?

Thinking this could not happen to you? I ask because while performing our health check on a non-Microsoft cloud PaaS provider for a new client, we detected stack dumps… and this was the first time our client was notified, yikes!

The DBA Profession Is Not Going Away

To wrap this up, I don’t see the DBA profession going away. The DBA role has pivoted and evolved constantly over the years. I don’t see that changing. It’s one of the many reasons why I am passionate about helping clients solve their business problems with SQL Server. While some think machines are close to automating away your job as a DBA, I would hate it if new DBAs or experienced DBAs saw written content like this and assumed they need to change their career. Not only do I respectfully disagree, I do not see this happening, and I am not alone.

When you want to focus on growing your skills, I see several benefits towards investing time both inside and outside of the role of a DBA.

I think automation will enhance good DBAs and reduce time spent on repeatable, trivial tasks, allowing DBAs to move faster and provide more value. I don’t see good DBAs being replaced by a handful of PowerShell scripts. To me, this is the DBA version of the sync button that allows turntablists to spend less time preparing a mix and more time providing a better experience to the dance floor.

Seriously, don’t just take my word for it. NPR predicts only a 3% chance of automation taking over DBAs’ jobs. Even better, take a look at the US Bureau of Labor Statistics and you will see DBAs are not losing their jobs. When the tedious parts of both database administration and DJing have been automated away, it leaves people like me with more time to do what we love. Embrace automation and find ways to add value so the business loves you.

Don’t make DBA stand for Don’t Bother Asking or you might get replaced by another DBA or a managed service provider.