Every year, hundreds of billions of packages are shipped around the world. In 2023 alone, a record 356 billion packages were shipped worldwide.

 

That’s a 100% increase since 2016. The industry will only get bigger and more complex, as estimates show that by 2028, nearly 500 billion packages will be shipped globally.

 

As delivery demand swells, supply chains grow more complex, and efficiency problems multiply.

 

This is where Machine Learning and Artificial Intelligence (AI) step in.

If you are not using Machine Learning, you are falling behind.

 

Companies worldwide, including many Fortune 500 names, have turned to Machine Learning and Artificial Intelligence to fine-tune their logistics and, most importantly, increase revenue and reduce costs.

 

When shipping out millions of packages per year, saving 5% in cost at every step is a massive boost to profits.

 

The numbers don’t lie either. Companies have already shown that AI is helping them meet their revenue goals. According to McKinsey, 61% of manufacturing executives report decreased costs, and 53% report increased revenues as a direct result of introducing AI into their supply chain.

Three Common Machine Learning Models

Several kinds of Machine Learning models have been designed to tackle different problems. Today, I will focus on three types of modeling systems: Time Series Models, Regression Models, and Classification Models.

 

Each offers something unique, and all can lead to efficiency improvements in the supply chain in different ways…

 

Time Series Forecasting Models

 

Time series forecasting is as simple as it gets when you want to improve efficiency with Machine Learning. The model takes historical data such as sales, seasonality, and trends and uses it to make predictions for the future.

 

For instance, let’s say you’re in the apparel business and need to stock your warehouses with winter gear for the upcoming season. Time series forecasting would let you look at previous years’ sales and show you exactly when to start stocking winter gear for the cold months ahead. This means you won’t stock too much of the wrong gear too early and take up precious warehouse space.

 

Likewise, it can keep you prepared for any major spikes in a single item based on sales trends. If you notice that in November, sales of winter apparel increase by 40%, you can ensure you aren’t caught off guard and avoid the risk of being understocked.

 

In doing so, you also keep customers happy, knowing they can always get the item they need, and you’re stocking to match what they will actually buy. Very few people buy sandals in January, but if the data shows they still make up 10% of your sales, you might as well keep selling them rather than lose out on that profit.
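To make this concrete, here is a minimal sketch of time series forecasting in Python using the statsmodels library. The monthly sales figures and the 12-month seasonal cycle below are illustrative assumptions, not real data.

    # Minimal time series forecasting sketch with Holt-Winters exponential smoothing.
    # The monthly winter-apparel sales history here is illustrative.
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    sales = pd.Series(
        [120, 90, 60, 40, 30, 25, 20, 35, 80, 150, 260, 310] * 3,
        index=pd.date_range("2021-01-01", periods=36, freq="MS"),
    )

    # Fit a model with an additive trend and a 12-month seasonal cycle.
    model = ExponentialSmoothing(
        sales, trend="add", seasonal="add", seasonal_periods=12
    ).fit()

    # Forecast the next six months to decide when to start stocking winter gear.
    print(model.forecast(6).round(0))

In a sketch like this, the seasonal component is what surfaces the "stock up before November" signal; a real model would be trained on your own sales history and validated against held-out months.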

 

Regression Models

 

A regression model is an AI tool that quantifies the relationship between variables, estimating how a change in one affects another.

 

In supply chains, regression models are well suited to predicting delivery times, because so many variables go into transportation, such as distance, weather conditions, and traffic.

 

Machine Learning and AI can look at the historical data of all of these variables and give you a leg up on the proper route to send your material from point A to point B. If you can avoid traffic, reroute around bad weather, and fully optimize routes taken, then you save on gas and time — reducing overall costs in the process.
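As a rough illustration, a delivery-time regression can be fit in a few lines with scikit-learn. The features (distance, rain, a traffic index) and the handful of historical deliveries below are assumptions chosen for the example, not a recommended feature set.

    # Minimal delivery-time regression sketch; features and data are illustrative.
    from sklearn.linear_model import LinearRegression

    # Each row: [distance_km, rain (0/1), traffic_index]; target: hours to deliver.
    X = [
        [120, 0, 2], [300, 1, 5], [80, 0, 1], [450, 1, 7],
        [200, 0, 3], [350, 0, 4], [150, 1, 6], [500, 1, 8],
    ]
    y = [3.0, 9.5, 2.0, 14.0, 5.0, 8.5, 6.0, 16.0]

    model = LinearRegression().fit(X, y)

    # Estimate delivery time for a 250 km route in clear weather with moderate traffic.
    print(model.predict([[250, 0, 3]]))

The same pattern works for staffing: swap the features for order counts, seasonality, and day of week, and the target for the number of workers needed.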

 

Much like time series forecasting, regression models also let you predict how large a staff you need. If you’re in charge of a warehouse, for instance, variables such as the number of orders, seasonality, and even the day of the week affect exactly how many staff you need to pick, box, and ship orders.

 

Furthermore, these factors affect how large a fleet you need to deliver your items. Having two trucks does no good when your demand requires eight. Likewise, having 100 workers at a warehouse doesn’t make sense when you only need 20 working that day.

 

Classification Models

 

Finally, let’s talk about classification models: AI tools that weigh many different variables to sort items into categories, supporting everything from risk assessment to deciding how to market to certain customers.

 

For instance, when it comes to risk assessment, you could use a classification model to flag which items are particularly prone to getting damaged. Factors you could feed in include the packaging used to handle the item, travel conditions (whether it will be on a bumpy road or in extreme temperatures), and the length of the journey (how many different handlers will touch it).

 

If you know all of these factors ahead of time and let the model classify the shipment as high or low risk, then you can take precautions so it doesn’t arrive damaged. You can beef up the packaging, add strict handling instructions, or arrange for special transportation to ensure that nothing goes wrong.
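Here is a hedged sketch of that idea with a scikit-learn classifier. The features, labels, and tiny training set are illustrative; in practice you would train on your own shipment history with far more examples.

    # Minimal damage-risk classification sketch; features and labels are illustrative.
    from sklearn.ensemble import RandomForestClassifier

    # Each row: [packaging_strength (1-3), rough_route (0/1), handler_count]; label 1 = arrived damaged.
    X = [
        [1, 1, 6], [3, 0, 2], [2, 1, 5], [3, 1, 4],
        [1, 0, 3], [2, 0, 2], [1, 1, 7], [3, 0, 3],
    ]
    y = [1, 0, 1, 0, 0, 0, 1, 0]

    clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

    # Classify a shipment with weak packaging, a rough route, and many handlers.
    prediction = clf.predict([[1, 1, 5]])[0]
    print("high risk" if prediction == 1 else "low risk")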

 

On top of that, when delivering items, you can use classification models to determine whether a package is at high or low risk of arriving late. Traffic, weather, distance, holiday seasonality, and package weight are all factors you can use to estimate when something will be delivered. This keeps customers and clients happier, as they’ll have a better understanding of exactly when they will receive their items.

 

Finally, you can even group your customers into different categories to give you a better idea of who to target with marketing. For instance, those with high spending habits or those who always pick next-day delivery are more worth targeting with marketing efforts than those who spend less or select standard delivery.

 

Machine Learning Can Improve Your Business Today

 

As we can see, the different Machine Learning and AI models out there offer a wide range of insights across the supply chain.

 

The best part is that the examples I mentioned are only the tip of the iceberg when it comes to providing solutions to supply chains and logistics.

 

Machine Learning and AI have changed the game in inventory management, quality control, delivery efficiency, and more. Their ability to fine-tune processes and optimize efficiency gives companies a leg up on their competitors. And every year, AI and Machine Learning models get better and better.

 

With companies from Walmart to Nike to FedEx and more adopting Machine Learning and AI into their supply chains, it only makes sense that other companies mimic their success and do the same.

Contact us to start a discussion about whether ProcureSQL can guide you along your Machine Learning and AI journey.

 

 

In this post, we will cover a new feature in preview for Microsoft Fabric: Lakehouse Schemas. If your Lakehouse is drowning in a sea of unorganized data, this might just be the lifeline you’ve been looking for.


The Data Chaos Conundrum

We’ve all been there. You’re knee-deep in a project, desperately searching for that one crucial dataset. It’s like trying to find your car keys in a messy apartment: frustrating and time-consuming. This is where Lakehouse Schemas come in, ready to declutter your data environment and give everything a proper home.


Lakehouse Schemas: Your Data’s New Best Friend

Think of Lakehouse Schemas as the Marie Kondo of the data world. They help you group your data in a way that just makes sense. It’s like having a super-organized filing cabinet for your data.

Why You’ll Love It

  • Logical Grouping: Arrange your data by department, project, or whatever makes the most sense for you and your team.
  • Clarity: No more data haystack. Everything has its place.

The Magic of Organized Data

Let’s break down how these schemas can transform your data game:

Navigate Like a Pro

Gone are the days of endless scrolling through tables. Prior to this feature, if your bronze layer had many sources and thousands of tables, finding the exact table in your Lakehouse was a difficult task. With schemas, finding data just became a lot easier.

Pro Tip: In bronze, organize sources into schemas. In silver, organize customer data, sales data, and inventory data into separate schemas. Your team will thank you when they can zoom right to what they need.
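As a rough sketch of what that organization can look like in code, here is how you might create schemas and land a table into one from a Fabric notebook attached to a schema-enabled Lakehouse. The notebook's built-in spark session is assumed, and the schema and table names are illustrative.

    # Create schemas and save a managed Delta table into one of them.
    # Assumes a Fabric notebook attached to a schema-enabled Lakehouse,
    # where the `spark` session is predefined; names are illustrative.
    spark.sql("CREATE SCHEMA IF NOT EXISTS sales")
    spark.sql("CREATE SCHEMA IF NOT EXISTS inventory")

    # Write a cleaned DataFrame as a managed Delta table inside the sales schema.
    clean_orders = spark.read.table("dbo.orders_staging")  # illustrative source table
    clean_orders.write.format("delta").mode("overwrite").saveAsTable("sales.orders")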

Schema Shortcuts: Instant Access to Your Data Lake

Here’s where it gets even better. Schema shortcuts allow you to create a direct link to a folder in your data lake. Say you have a folder structure in your data lake like silver/sales filled with hundreds of Delta tables; creating a schema shortcut to silver/sales automatically generates a sales schema, then discovers and displays all the tables within that folder structure instantly.

Why It Matters: No need to manually register each table. The schema shortcut does the heavy lifting, bringing all your data into the Lakehouse schema with minimal effort.

Note: Table names with special characters may route to the Unidentified folder within your lakehouse, but they will still show up as expected in your SQL Analytics Endpoint.

Quick Tip: Use schema shortcuts to mirror your data lake’s organization in your Lakehouse. This keeps everything consistent and easy to navigate.

Manage Data Like a Boss

Ever tried to update security on multiple tables at once? With schemas, it’s a walk in the park.

Imagine This: Need to tweak security settings? Do it once at the schema level, and you’re done. No more table-by-table marathon.

Teamwork Makes the Dream Work

When everyone knows where everything is, collaboration becomes a whole lot smoother.

Real Talk: Clear organization means fewer “Where’s that data?” messages and more actual work getting done.

Data Lifecycle Management Made Easy

Keep your data fresh and relevant without the headache.

Smart Move: Create an Archive or Historical schema for old data and a “Current” schema for the hot stuff. It’s like spring cleaning for your data!

Take Control with Your Own Delta Tables

Managing your own Delta tables adds overhead, but it gives you greater flexibility and control over your data than relying on fully managed tables within Fabric.

The Benefits:

  • Customization: Tailor your tables to fit your specific needs without the constraints of Fabric managed tables.
  • Performance Optimization: Optimize storage and query performance by configuring settings that suit your data patterns. Be aware that you must maintain your own maintenance schedules for optimizations such as vacuum and v-order when managing your own tables.
  • Data Governance: Maintain full control over data versioning and access permissions.

Pro Tip: Use Delta tables in conjunction with schemas and schema shortcuts to create a robust and efficient data environment that you control from end to end.
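A minimal sketch of that end-to-end pattern, assuming a Spark session such as a Fabric notebook's built-in spark object; the storage account, paths, and table names are illustrative placeholders:

    # Write and maintain your own Delta table directly in the data lake,
    # then surface it in the Lakehouse via a schema shortcut to silver/sales.
    lake_path = "abfss://data@mystorage.dfs.core.windows.net/silver/sales/orders"  # illustrative

    df = spark.read.table("dbo.orders_staging")  # illustrative source
    df.write.format("delta").mode("append").save(lake_path)

    # You own the maintenance now: OPTIMIZE compacts small files, VACUUM removes old versions.
    spark.sql(f"OPTIMIZE delta.`{lake_path}`")
    spark.sql(f"VACUUM delta.`{lake_path}` RETAIN 168 HOURS")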


Getting Started: Your Step-by-Step Guide

Ready to bring order to the chaos? Here’s how to get rolling with Lakehouse Schemas:

Create Your New Lakehouse

At the time of writing, you cannot enable custom schemas on existing Lakehouses. You must create a new Lakehouse and check the Lakehouse schemas checkbox. Having to redo your Lakehouse can be a bit of an undertaking if your Delta tables are not well organized, but getting your data tidied up will pay dividends in the long run.

Plan Your Attack

Sketch out how you want to organize things. By department? Project? Data type? You decide what works best for you and your team.

Create Your Schemas

Log into Microsoft Fabric, head to your Lakehouse, and start creating schemas. For folders in your data lake, create schema shortcuts to automatically populate schemas with your existing tables.

Example: Create a schema shortcut to silver/sales, and watch as your Lakehouse schema fills up with all your sales tables, no manual import needed.
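If you want to sanity-check the result from a notebook, here is a quick sketch, assuming the Fabric notebook's built-in spark session and the illustrative sales schema from the example above:

    # Confirm the shortcut's tables were discovered under the sales schema.
    spark.sql("SHOW TABLES IN sales").show()

    # Shortcut tables are queryable like any other Lakehouse table; `orders` is illustrative.
    spark.sql("SELECT COUNT(*) AS order_count FROM sales.orders").show()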

Play Data Tetris

If you choose not to use schema shortcuts, you can move tables into their new homes; it’s as easy as drag and drop. If you are using schema shortcuts, any reshuffling of schemas happens at the data lake location of your Delta tables, through your data pipelines.

Manage Your Own Delta Tables

Consider creating and managing your own Delta tables for enhanced control. Store them in your data lake and link them via schema shortcuts.

Stay Flexible

As your needs change, don’t be afraid to shake things up. Add new schemas or schema shortcuts, rename old ones, or merge as needed.


Pro Tips for Schema Success

  • Name Game: Keep your schema names clear and consistent. Work with the business on naming as well to help prevent confusion about what is what.
  • Leverage Schema Shortcuts: Link directly to your data lake folders to auto-populate schemas.
  • Document It: Document what goes where. Future you will be grateful, and so will your team.
  • Team Effort: Get everyone’s input on the structure. It’s their data home too.
  • Own Your Data: Manage your own Delta tables for maximum flexibility.
  • Stay Fresh: Regularly review and update your schema setup and configuration.

The Big Picture

Organizing your data isn’t just about tidiness—it’s about setting yourself up for success.

  • Room to Grow: A well-planned schema system scales with your data.
  • Time is Money: Less time searching means more time for actual analysis.
  • Take Control: Managing your own Delta tables adds some overhead, but it also gives you more control and the flexibility to optimize your data environment.
  • Instant Access: Schema shortcuts bridge your data lake and Lakehouse seamlessly.
  • Roll with the Punches: Easily adapt to new business needs without a data meltdown.

Wrapping It Up

Microsoft Fabric’s Lakehouse Schemas and Schema Shortcuts are like a superhero cape for your Lakehouse environment. They bring order to chaos, boost team productivity, and make data management a breeze.

Remember:

  • Schemas create a clear roadmap for your data.
  • Schema shortcuts automatically populate schemas from your data lake folders.
  • Managing your own Delta tables gives you more control and efficiency.
  • Your team will work smarter, not harder.
  • Managing and updating data becomes way less of a headache.

So why not give the new Lakehouse Schemas feature a shot? Turn your data jungle into a well-organized garden and watch your productivity grow!


Happy data organizing!

Are you tired of people complaining about your database’s slow performance? Maybe you are tired of staring at a black box and hoping to find what is slowing your applications down. Hopefully, you are not just turning your server off and back on when people say performance is terrible. Regardless, I want to share a quick SQL Server performance-tuning checklist that Information Technology professionals can follow to speed up their SQL Server instances before procuring a performance-tuning consultant.

Bad SQL Server performance usually results from bad queries, blocking, slow disks, or a lack of memory. Benchmarking your workload is the quickest way to determine if performance worsens before end users notice it.  Today, we will cover some basics of why good queries can go wrong, the easiest way to resolve blocking, and how to detect if slow disks or memory is your problem.

The Power of Up-to-Date Statistics: Boosting SQL Server Performance

If you come across some outdated SQL Server performance tuning recommendations, they might focus on Reorganizing and Rebuilding Indexes. Disk technology has evolved a lot. We no longer use spinning disks that greatly benefit from sequential rather than random reads. On the other hand, ensuring you have good, updated statistics is the most critical maintenance task you can do regularly to improve performance.

SQL Server uses statistics to estimate how many rows are included or filtered out when you add filters to your queries. If your statistics are outdated or were built with a minimal sample rate percentage, which is typical for large tables, the risk of inadequate or misleading statistics is high, and your queries slow down. One proactive measure to prevent performance issues is a maintenance plan that updates your statistics on a regular schedule.

For your very large tables, you will want to include a sample rate. This ensures a sufficient percentage of rows is sampled when statistics updates occur, since the default sample rate percentage goes down as your table row count grows. I recommend starting with twenty percent for large tables while you benchmark and adjust as needed (more on benchmarking later in this post).
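If you want to script that yourself rather than rely on a maintenance tool, here is a minimal Python sketch using pyodbc. The connection string and table name are placeholders, and the T-SQL string is the part that matters; PERSIST_SAMPLE_PERCENT, available on recent SQL Server versions, keeps the chosen sample rate for future updates.

    # Update statistics on a large table with an explicit 20 percent sample rate.
    # Connection string and table name are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=MyServer;"
        "DATABASE=MyDatabase;Trusted_Connection=yes;",
        autocommit=True,
    )
    conn.execute(
        "UPDATE STATISTICS dbo.BigTable "
        "WITH SAMPLE 20 PERCENT, PERSIST_SAMPLE_PERCENT = ON;"
    )
    conn.close()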

Memory Matters: Optimizing SQL Server Buffer Pool Usage

All relational database engines thrive on caching data in memory, and most business applications do more reading than writing. One of the easiest things you can do in an on-premises environment is add memory and adjust the max memory setting of your instance, allowing more execution plans and data pages to be cached in memory for reuse.
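If you want to see what that looks like in practice, here is a hedged sketch of setting max server memory with sp_configure, scripted through pyodbc. The 112 GB value is an assumption (leaving headroom for the operating system on a 128 GB server); size it for your own environment, and treat the connection string as a placeholder.

    # Set max server memory to roughly 112 GB (114688 MB); values are illustrative.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=MyServer;"
        "DATABASE=master;Trusted_Connection=yes;",
        autocommit=True,
    )
    conn.execute("EXEC sp_configure 'show advanced options', 1; RECONFIGURE;")
    conn.execute("EXEC sp_configure 'max server memory (MB)', 114688; RECONFIGURE;")
    conn.close()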

If you are in the cloud, it might be a good time to double-check that you are using the best compute family for your databases. SQL Server’s licensing model is per core, and all cloud providers pass that cost on to their customers. You could benefit from a memory-optimized SKU with fewer CPU cores and more RAM: instead of paying extra for CPU cores you don’t need, you can put that budget toward the RAM you require.

Ideally, you would be using the Enterprise edition of SQL Server and have more RAM than the expected size of your databases. If you use the Standard edition, ensure the server has more than 128GB of RAM, since 128GB is the most the Standard edition buffer pool can use. If you use the Enterprise edition, load your server with as much RAM as possible, as there is no limit to how much RAM can be utilized for caching your data.

While you might think this is expensive, it’s cheaper than paying a consultant to tell you that you would benefit from more RAM and then buying that RAM or sizing up your cloud compute anyway.

Disk Optimization for SQL Server Performance

The less memory you have, the more stress your disks experience, which is why we focus on memory before disk. With relational databases, we want to follow three metrics for disk activity: IOPS, the number of disk transfers (reads and writes) per second; throughput, the total size of I/O per second; and latency, the average time it takes to complete each I/O operation.

Ideally, you want your reads and writes to be as fast as possible, but more disk transfers lead to increased latency. If your read or write latency is consistently above 50ms, your I/O is slowing you down, and you either need to improve your storage or reduce I/O consumption. Benchmark your disk activity so you know when you are reaching your storage limits.
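One quick way to see your average latency per database file is to query sys.dm_io_virtual_file_stats. The sketch below wraps that T-SQL in Python with pyodbc; the connection string is a placeholder, and keep in mind the numbers are cumulative averages since the instance last started, so benchmark over time rather than trusting a single snapshot.

    # Average read/write latency per database file since the instance started.
    import pyodbc

    query = """
    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads,  0) AS avg_read_latency_ms,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    ORDER BY avg_read_latency_ms DESC;
    """

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=MyServer;"
        "DATABASE=master;Trusted_Connection=yes;"
    )
    for row in conn.execute(query):
        print(row.database_name, row.physical_name,
              row.avg_read_latency_ms, row.avg_write_latency_ms)
    conn.close()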

Read Committed Snapshot Isolation: The Easy Button to Reduce Blocking Issues

Does your database suffer from excessive blocking? By default, SQL Server uses pessimistic locking, which means readers block writers and writers block readers. Read Committed Snapshot Isolation (RCSI) uses optimistic, version-based locking, which removes that reader-writer blocking. Queries that change data still block other queries trying to change the same data.

Utilizing RCSI is an easy way to improve SQL Server performance when your queries are slow due to blocking. RCSI works by leveraging tempdb to store row versions: any transaction that changes data stores the old row version in an area of tempdb called the version store. If you already have disk issues with tempdb, this adds more stress, which is why we want to focus on disk optimizations before implementing RCSI.
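Enabling RCSI is a single ALTER DATABASE statement; a hedged sketch scripted with pyodbc is below. The database name and connection string are placeholders, and because WITH ROLLBACK IMMEDIATE rolls back any open transactions so the change can take effect, run it during a maintenance window.

    # Enable Read Committed Snapshot Isolation on a database; names are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=MyServer;"
        "DATABASE=master;Trusted_Connection=yes;",
        autocommit=True,
    )
    conn.execute(
        "ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON "
        "WITH ROLLBACK IMMEDIATE;"
    )
    conn.close()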

You Are Not Alone: Ask For Help!

You are not alone. Procure SQL can help you with your SQL Server Performance Tuning questions.

Have you reached this step after following the recommendations for statistics, memory, disks, and RCSI? If so, this is a great time to contact us or another performance-tuning consultant. A qualified SQL Server performance tuning consultant can review your benchmarks with you and identify the top offenders, which can then be tuned with indexes, code changes, and other techniques.

If you need help or just want to join our newsletter, fill out the form below, and we will gladly walk with you on your journey to better SQL Server performance!
