
Setting page visibility and the active page are often overlooked last steps when publishing a Power BI report. It’s easy to forget the active page since it’s just set to whatever page was open when you last saved the report. But we don’t have to settle for manually checking these things before we deploy to a new workspace (e.g., from dev to prod). If our report is in PBIR format, we can run Fabric notebooks to do this for us. This is where Semantic Link Labs helps us.

You can download my notebook here. I’ll walk through the steps in this post.

Code Walkthrough

First, we must install semantic-link-labs. If you already have an environment with this library installed, you can use that and skip this step.

%pip install semantic-link-labs

Next, we need to import some modules.

# Imports
import sempy_labs as labs
from sempy_labs import report
import ipywidgets as widgets
from IPython.display import display

Then we can get to work. First, I’m capturing the following information using widgets: workspace ID, report ID, and page name.

w_workspace = widgets.Text( description = 'Workspace ID',style={'description_width': 'initial'}) 
w_report = widgets.Text(description = 'Report ID', style={'description_width': 'initial'})
w_activepage = widgets.Text(description = 'Active Page Name', style={'description_width': 'initial'})
display(w_workspace)
display(w_report)
display(w_activepage)

Running the code above will create 3 widgets. You will need to enter the required information into the widgets.

Fabric notebook widgets that capture workspace ID, report ID, and active page name

You could use variables in a cell to collect the required information. I’m using widgets to make it clear what information needs to be entered.
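
If you prefer plain variables, a cell like this (with hypothetical placeholder values) works just as well:

# Alternative to the widgets: assign the values directly (placeholder values shown)
workspace_id = "00000000-0000-0000-0000-000000000000"   # hypothetical workspace ID
report_id = "11111111-1111-1111-1111-111111111111"      # hypothetical report ID
active_page_name = "Main"                                # name of the page to set as active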

Once you have filled in the textboxes, you can run the last 2 cells. The fourth code cell is where I’m actually making the changes to the report.

var_reportname = labs.resolve_report_name(w_report.value, workspace=w_workspace.value)

var_rptw = labs.report.ReportWrapper(report=w_report.value, workspace=w_workspace.value,readonly=False)

var_rptw.set_active_page(w_activepage.value)
var_rptw.hide_tooltip_drillthrough_pages()
var_rptw.save_changes()

var_rptw.list_pages()

First, I use the report ID entered into the widget to get the report name.

Then I create my report wrapper (var_rptw). This object will be used with all the subsequent functions.

Next I set the active page to the page name entered into the w_activepage widget using the set_active_page() function. Then I call hide_tooltip_drillthrough_pages().

Each page has associated metadata that indicates whether it is a tooltip page and whether it is a drillthrough target page. I believe the tooltip page is determined by the page information setting labeled “Allow use as tooltip”.

Power BI page formatting options showing the Page information section containing the page name and the "allow use as tooltip" setting.
The Allow use as tooltip setting is in the Page information section

For drillthrough pages, I believe the presence of a field in the Drill through field well on the page is what causes it to be flagged as a drill through page.

The Visualizations pane in Power BI showing a field populated in the drillthrough field pane.
The drill through fields are in the Visualizations pane when the page is selected.

Calling the set_active_page() and hide_tooltip_drillthrough_pages() functions changes the metadata on the report object, but we have to save those changes back to the report in the target workspace for them to take effect. This is why we call var_rptw.save_changes().

Once we save the changes, we get a response back that lists the changes made to the report.

🟢 The 'Main' page has been set as the active page in the 'DataSavvyReport2' report within the 'SempyLabsTest' workspace.
🟢 The 'Tooltip' page has been set to 'hidden' in the 'DataSavvyReport2' report within the 'SempyLabsTest' workspace.
🟢 The 'Drillthrough' page has been set to 'hidden' in the 'DataSavvyReport2' report within the 'SempyLabsTest' workspace.
🟢 The report definition has been updated successfully.
The output from calling save_changes() lists which page was set to active, which pages have been hidden, and a confirmation that the report definition was saved.

Calling list_pages() produces a pandas DataFrame with metadata for each page. We can refer to the Hidden, Active, Type, and Drillthrough Target Page columns to confirm the desired changes.
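
A quick way to spot-check those columns (the column names below come from the description of the output above; adjust them to match what list_pages() actually returns in your environment):

# Inspect just the columns that confirm the changes
df_pages = var_rptw.list_pages()
df_pages[["Hidden", "Active", "Type", "Drillthrough Target Page"]]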

As a final confirmation, we can also view the Power BI report from within the notebook. That is what I’m doing with the launch_report() function. It provides a read-only view of the report in the notebook cell output.
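
This is presumably where the report name we resolved earlier comes into play. A minimal sketch of the call, with parameter names as my assumption (check the semantic-link-labs documentation for the exact signature):

# Embed a read-only view of the report in the notebook cell output
report.launch_report(report=var_reportname, workspace=w_workspace.value)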

Power BI report embedded in a notebook

More posts about Semantic Link Labs

So far, I’ve been exploring the report and admin subpackages of Semantic Link Labs. Below are some other blog posts I’ve written about them.

Finding fields used in a Power BI report in PBIR format with Semantic Link Labs

Get Power BI Report Viewing History using Semantic Link Labs

Want to learn more about Power BI Automation with Semantic Link Labs? Join our webinar this month covering this exact topic.

About ProcureSQL

ProcureSQL is the industry leader in providing data architecture as a service, enabling companies to harness their data and grow their business. ProcureSQL is 100% onshore in the United States and supports the four quadrants of data, including application modernization, database management, data analytics, and data visualization. ProcureSQL works as a guide, mentor, leader, and implementer to provide innovative solutions to drive better business outcomes for all businesses. Click here to learn more about our service offerings.

Do you have questions about leveraging AI, Microsoft Fabric, or the Microsoft Data Platform? You can chat with us for free one-on-one, or contact the team. We would love to share our knowledge and experience with you.

Microsoft Fabric Mirroring changes the game with Data Ingestion, giving you near real-time data with a no-code framework

Microsoft’s Fabric Mirroring will change how you perform data ingestion. If you are using products to automate batch processes for data dumping, did you know that Fabric Mirroring might remove the need for these tools and provide you with near real-time access to the data as it changes in the source systems?

If you have not yet heard of the medallion architecture, it uses bronze, silver, and gold layers to describe how data moves from intake into your data hub to consumption by your reporting applications of choice. This multi-layered approach existed before I started my analytics career in the early 2000s. Think of it simply as bronze being your unprocessed data, silver being somewhat cleaned and organized data processed from your bronze layer, and gold being your aggregated and optimized data ready for prime-time business insights.

It’s essential to understand the evolution of data management. From the ’90s to the early 2000s, the process of getting data from each application (referred to as a spoke) into your data repository (data hub) was complex. In the Microsoft world, multiple SSIS packages or other processes were used to pull data into tables with varchar(max); this was typically a batch process that ran on a schedule, leading to potential issues.  There were so many SSIS packages that we needed an automation language to build them all, rather than doing them individually.

Many businesses’ analytics projects struggle to quickly integrate the correct data into their hub so that data transformations and validations can be effective. If you get this wrong, there is no point in collecting $200 and passing Go. Your data analytics project might end up going straight to jail.

How can we load data quickly and successfully?

I am introducing you to a no-code, near-real-time option for loading your data into your data lake (data hub) within Fabric. This new feature is known as Fabric Mirroring.

While I love the functionality of Fabric Mirroring, I am not a fan of the name. Many people with SQL Server experience think this is similar to Database Mirroring because these names are similar.

In my opinion, Fabric mirroring is similar to implementing Change Data Capture (CDC) on your SQL Server databases. CDC feeds data into a real-time streaming tool like Apache Kafka to copy data from your spoke (SQL Server application database) into your hub (Data Lake).

The benefit here is twofold. First, you don’t have to manage the Change Data Capture or Kafka implementations. Second, and most importantly, this is more than just an SQL Server solution. In the future, you can use Fabric Mirroring to ingest data from all your sources (spokes) into your data hub in near real-time, with minimal to no code required.

For example, here is how to use Fabric Mirroring to import Dynamics 365 or Power Apps data into Fabric. You can do the same for Azure Cosmos DB and Snowflake, and SQL Server mirroring is now in preview as well.

Currently, the following databases are available:

| Platform | Near real-time replication | Type of mirroring |
| --- | --- | --- |
| Microsoft Fabric mirrored databases from Azure Cosmos DB (preview) | Yes | Database mirroring |
| Microsoft Fabric mirrored databases from Azure Databricks (preview) | Yes | Metadata mirroring |
| Microsoft Fabric mirrored databases from Azure Database for PostgreSQL flexible server (preview) | Yes | Database mirroring |
| Microsoft Fabric mirrored databases from Azure SQL Database | Yes | Database mirroring |
| Microsoft Fabric mirrored databases from Azure SQL Managed Instance (preview) | Yes | Database mirroring |
| Microsoft Fabric mirrored databases from Snowflake | Yes | Database mirroring |
| Microsoft Fabric mirrored databases from SQL Server (preview) | Yes | Database mirroring |
| Open mirrored databases | Yes | Open mirroring |
| Microsoft Fabric mirrored databases from Fabric SQL database (preview) | Yes | Database mirroring |

Now I know I can use Fabric Mirroring to get near real-time data into my hub with no code required. Why else is Fabric Mirroring a game-changer for my analytics projects?

Fabric Mirroring enables us to accomplish a lot more in less time

Suppose you have an SLA for getting data into a data warehouse within 24 hours, and processing through all the layers takes you 24 hours (12 hours into bronze, 6 hours from bronze to silver, and 6 hours from silver to gold), leaving no room to spare. If changes now land in bronze in near real time, say 90 seconds, that frees up nearly 12 extra hours to improve data quality, data validation, and the rest of the pipeline.

Centralized Data Management

Having a single hub (a centralized data store) that the applications (spokes) automatically send data to eliminates the need to install and manage additional software, clients, and tools at each spoke. You transition from pulling data from the spokes with batch processing to pushing data from the spokes in near real-time. It also simplifies data governance and enhances security, because pairing the hub with a governance tool such as Microsoft Purview lets you see which spokes the data comes from.

For example, suppose you must comply with GDPR and Sarah in the UK has requested that her data be removed. From the hub, you can quickly trace her data back to the spokes and determine what needs to be purged.

Simplified Data Ingestion

Instead of mixing and matching different source formats, your Delta tables will be created from your Azure Cosmos DB databases, Azure SQL databases, Dynamics 365, and other future Fabric Mirroring sources. You no longer need to worry about which sources are Excel, CSV, flat file, JSON, etc. They all arrive in the same format, ready for your transformations, data validation, and any business rules required for your silver layer.
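
As a rough illustration, assuming the mirrored table is reachable from your Fabric notebook (for example, through a OneLake shortcut in the attached lakehouse) and using hypothetical table names, a PySpark cell can promote mirrored data to silver like any other Delta table:

# 'spark' is the SparkSession that Fabric notebooks provide automatically.
# Read a mirrored table (already landed as Delta); the table name is hypothetical.
orders_bronze = spark.read.table("dbo_orders")

# Apply example validation and business rules for the silver layer
orders_silver = (
    orders_bronze
    .dropDuplicates(["order_id"])
    .filter("order_total IS NOT NULL")
)

# Write the cleaned data back as a Delta table (name is hypothetical)
orders_silver.write.format("delta").mode("overwrite").saveAsTable("silver_orders")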

Improved Query Performance

Those who know me know that I love discussing query performance tuning. I am passionate about making databases go just as fast as your favorite F1 race car. I also know that you have at least one group of people running reporting queries against your line-of-business application database or an availability group replica. This leads to increased locking that slows down the workloads your application databases were built for. With the data mirrored into your hub, those reports can run against the hub instead, and that contention goes away.

The mirrored data is also stored in an analytics-ready format, such as delta tables, which enhances query performance across various tools within Microsoft Fabric, including Power BI.

What if you cannot use Fabric Mirroring?

The sources supported by Fabric Mirroring to date are limited. If I had on-premises data sources or other sources that are not yet ready for Fabric Mirroring, I would still encourage the same architectural approach: use change data capture, where available, to stream your data into the data hub of your choice.


In this post, we will cover a new feature in preview for Microsoft Fabric: Lakehouse Schemas. If your Lakehouse is drowning in a sea of unorganized data, this might just be the lifeline you’ve been looking for.


The Data Chaos Conundrum

We’ve all been there. You’re knee-deep in a project, desperately searching for that one crucial dataset. It’s like trying to find your car keys in a messy apartment: frustrating and time-consuming. This is where Lakehouse Schemas come in, ready to declutter your data environment and give everything a proper home.


Lakehouse Schemas: Your Data’s New Best Friend

Think of Lakehouse Schemas as the Marie Kondo of the data world. They help you group your data in a way that just makes sense. It’s like having a super-organized filing cabinet for your data.

Why You’ll Love It

  • Logical Grouping: Arrange your data by department, project, or whatever makes the most sense for you and your team.
  • Clarity: No more data haystack. Everything has its place.

The Magic of Organized Data

Let’s break down how these schemas can transform your data game:

Navigate Like a Pro

Gone are the days of endless scrolling through tables. Before this feature, if your bronze layer had many sources and thousands of tables, finding the exact table in your Lakehouse was difficult. With schemas, finding data just became a lot easier.

Pro Tip: In bronze, organize sources into schemas. In silver, organize customer data, sales data, and inventory data into separate schemas. Your team will thank you when they can zoom right to what they need.
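
A minimal sketch of that layout in a schema-enabled Lakehouse, using Spark SQL with hypothetical schema and table names:

# Create silver-layer schemas (names are hypothetical examples)
spark.sql("CREATE SCHEMA IF NOT EXISTS customer")
spark.sql("CREATE SCHEMA IF NOT EXISTS sales")
spark.sql("CREATE SCHEMA IF NOT EXISTS inventory")

# Write a DataFrame into a table under the sales schema (df_orders is hypothetical)
df_orders.write.format("delta").mode("overwrite").saveAsTable("sales.orders")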

Schema Shortcuts: Instant Access to Your Data Lake

Here’s where it gets even better. Schema Shortcuts allow you to create a direct link to a folder in your data lake. Say you have a folder structure in your data lake like silver/sales filled with hundreds of Delta tables. Creating a schema shortcut to silver/sales automatically generates a sales schema, then discovers and displays all the tables within that folder structure instantly.

Why It Matters: No need to manually register each table. The schema shortcut does the heavy lifting, bringing all your data into the Lakehouse schema with minimal effort.

Note: Table names with special characters may route to the Unidentified folder within your lakehouse, but they will still show up as expected in your SQL Analytics Endpoint.

Quick Tip: Use schema shortcuts to mirror your data lake’s organization in your Lakehouse. This keeps everything consistent and easy to navigate.

Manage Data Like a Boss

Ever tried to update security on multiple tables at once? With schemas, it’s a walk in the park.

Imagine This: Need to tweak security settings? Do it once at the schema level, and you’re done. No more table-by-table marathon.

Teamwork Makes the Dream Work

When everyone knows where everything is, collaboration becomes a whole lot smoother.

Real Talk: Clear organization means fewer “Where’s that data?” messages and more actual work getting done.

Data Lifecycle Management Made Easy

Keep your data fresh and relevant without the headache.

Smart Move: Create an Archive or Historical schema for old data and a “Current” schema for the hot stuff. It’s like spring cleaning for your data!
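
A small sketch of that split in Spark SQL, with hypothetical schema, table, and column names:

# Copy older rows into an Archive schema; current data stays where it is.
spark.sql("CREATE SCHEMA IF NOT EXISTS archive")
spark.sql("""
    CREATE TABLE IF NOT EXISTS archive.orders_2023 AS
    SELECT * FROM sales.orders
    WHERE order_date < '2024-01-01'
""")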

Take Control with Your Own Delta Tables

Managing your own Delta tables is added overhead, but gives you greater flexibility and control over your data, compared to relying on fully managed tables within Fabric.

The Benefits:

  • Customization: Tailor your tables to fit your specific needs without the constraints of Fabric managed tables.
  • Performance Optimization: Optimize storage and query performance by configuring settings that suit your data patterns. Be aware that you must maintain your own maintenance schedules for optimizations such as VACUUM and V-Order when managing your own tables; a maintenance sketch follows the Pro Tip below.
  • Data Governance: Maintain full control over data versioning and access permissions.

Pro Tip: Use Delta tables in conjunction with schemas and schema shortcuts to create a robust and efficient data environment that you control from end to end.
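
Here is a hedged maintenance sketch for a self-managed table, reusing the hypothetical sales.orders table from earlier; the V-Order property name reflects my understanding of the Fabric Spark setting, so verify it against the Fabric documentation:

# Enable V-Order writes for this session (verify the property name in the Fabric docs)
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")

# Compact small files, then remove files outside the default retention window
spark.sql("OPTIMIZE sales.orders")
spark.sql("VACUUM sales.orders")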


Getting Started: Your Step-by-Step Guide

Ready to bring order to the chaos? Here’s how to get rolling with Lakehouse Schemas:

Create Your New Lakehouse

At the time of writing, you cannot enable custom schemas on existing Lakehouses. You must create a new Lakehouse and check the Lakehouse schemas checkbox. Having to redo your Lakehouse can be a bit of an undertaking if your Delta tables are not well organized, but getting your data tidied up will pay dividends in the long run.

Plan Your Attack

Sketch out how you want to organize things. By department? Project? Data type? You decide what works best for you and your team.

Create Your Schemas

Log into Microsoft Fabric, head to your Lakehouse, and start creating schemas. For folders in your data lake, create schema shortcuts to automatically populate schemas with your existing tables.

Example: Create a schema shortcut to silver/sales, and watch as your Lakehouse schema fills up with all your sales tables, no manual import needed.

Play Data Tetris

If you choose not to use schema shortcuts, you can move any tables into their new homes. It’s as easy as drag and drop. If you are using schema shortcuts, any shifting between schemas happens at the data lake location of your Delta table, handled by your data pipelines.

Manage Your Own Delta Tables

Consider creating and managing your own Delta tables for enhanced control. Store them in your data lake and link them via schema shortcuts.

Stay Flexible

As your needs change, don’t be afraid to shake things up. Add new schemas or schema shortcuts, rename old ones, or merge as needed.


Pro Tips for Schema Success

  • Name Game: Keep your schema names clear and consistent. Work with the business on naming as well, to help prevent confusion about what is what.
  • Leverage Schema Shortcuts: Link directly to your data lake folders to auto-populate schemas.
  • Document It: Document what goes where. Future you will be grateful, and so will your team.
  • Team Effort: Get everyone’s input on the structure. It’s their data home too.
  • Own Your Data: Manage your own Delta tables for maximum flexibility.
  • Stay Fresh: Regularly review and update your schema setup and configuration.

The Big Picture

Organizing your data isn’t just about tidiness—it’s about setting yourself up for success.

  • Room to Grow: A well-planned schema system scales with your data.
  • Time is Money: Less time searching means more time for actual analysis.
  • Take Control: Managing your own Delta tables adds some overhead, but it gives you more control and the flexibility to optimize your data environment.
  • Instant Access: Schema shortcuts bridge your data lake and Lakehouse seamlessly.
  • Roll with the Punches: Easily adapt to new business needs without a data meltdown.

Wrapping It Up

Microsoft Fabric’s Lakehouse Schemas and Schema Shortcuts are like a superhero cape for your Lakehouse environment. They bring order to chaos, boost team productivity, and make data management a breeze.

Remember:

  • Schemas create a clear roadmap for your data.
  • Schema shortcuts automatically populate schemas from your data lake folders.
  • Managing your own Delta tables gives you more control and efficiency.
  • Your team will work smarter, not harder.
  • Managing and updating data becomes way less of a headache.

So why not give the new Lakehouse Schemas feature a shot? Turn your data jungle into a well-organized garden and watch your productivity grow!


Happy data organizing!