Welcome to my blog on Microsoft Fabric! Whether you're just starting out or considering it for your organisation, this blog will briefly summarise what Fabric is all about.
I will explain the key features and new concepts to help you decide if Fabric is the right choice for your team. Think of it as taking your first steps into a new pool - we will start by getting our feet wet, then gradually dive deeper into what matters most for your business.
Microsoft Fabric has been with us for a year now, and like any new technology, it's growing and changing. While it's still developing, there's already so much to explore. Together, we'll scratch the surface and then dig deeper into what Fabric can do for your organisation.
By the end of this blog, you will have a clear roadmap for:
- Understanding what Fabric is and isn't
- Deciding if it's the right fit for your needs
- Planning a successful rollout in your organisation
What is Microsoft Fabric?
Microsoft Fabric is an end-to-end analytics and data platform designed for enterprises that require a unified solution. It encompasses data movement, processing, ingestion, transformation, real-time event routing, and report building. It offers a comprehensive suite of services including Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases.
With Fabric, you don't need to assemble different services from multiple vendors. Instead, it offers a seamlessly integrated, user-friendly platform that simplifies your analytics requirements. Operating on a Software as a Service (SaaS) model, Fabric brings simplicity and integration to your solutions.
Microsoft Fabric integrates separate components into a cohesive stack. Instead of relying on different databases or data warehouses, you can centralise data storage with OneLake. AI capabilities are seamlessly embedded within Fabric, eliminating the need for manual integration. With Fabric, you can easily transition your raw data into actionable insights for business users.
There are eight main components of Fabric. Let's explore!
1) Power BI
Microsoft Power BI is a data visualisation platform used primarily for business intelligence purposes. Power BI stands for Power Business Intelligence and refers to a set of software tools and connectors that help you transform data from multiple sources into actionable insights.
For more details, refer to Microsoft documentation here. Also, check out my previous blogs Introduction to Power BI and Power BI Visuals.
2) Data Factory
Data Factory is the data integration workload of Microsoft Fabric: an ETL solution that runs in the cloud. It is not a new product or service; it builds on many years of Microsoft data transformation tools and services, and sits on top of Power Query and Azure Data Factory. Microsoft documentation for Azure Data Factory can be found here, and here's the link to a quick learning path.
Microsoft Fabric's Data Factory has two powerful tools that work together for your ETL (Extract, Transform, Load) needs:
a) Dataflows
- Handles the core data processing work
- Gets data from your sources
- Transforms it to meet your needs
- Loads it to your target location
b) Data pipelines
- Acts like a conductor, orchestrating the whole process
- Controls when and how things run
- Manages the flow of activities
- Coordinates different tasks and dependencies
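To make the split between the two tools concrete, here is a minimal sketch in plain Python. It is not the Data Factory API: the function names (extract, transform, load, run_dataflow, run_pipeline), the sample rows, and the in-memory target are all made up for illustration. The dataflow part does the extract, transform, and load work, while the pipeline part only decides what runs and in which order.

```python
from typing import Callable

# Hypothetical in-memory source and target; a real dataflow would use connectors.
SOURCE_ROWS = [
    {"order_id": 1, "amount": "19.99", "country": "nz"},
    {"order_id": 2, "amount": "5.00", "country": "au"},
]
TARGET_TABLE: list[dict] = []


def extract() -> list[dict]:
    """Get data from the source."""
    return [dict(row) for row in SOURCE_ROWS]


def transform(rows: list[dict]) -> list[dict]:
    """Cast amounts to float and upper-case the country codes."""
    return [
        {**row, "amount": float(row["amount"]), "country": row["country"].upper()}
        for row in rows
    ]


def load(rows: list[dict]) -> int:
    """Write rows to the target and return how many were loaded."""
    TARGET_TABLE.extend(rows)
    return len(rows)


def run_dataflow() -> int:
    """The 'dataflow' role: extract, transform, load."""
    return load(transform(extract()))


def run_pipeline(
    activities: dict[str, Callable[[], object]],
    depends_on: dict[str, list[str]],
) -> list[str]:
    """The 'pipeline' role: run each activity only after its dependencies."""
    done: list[str] = []
    pending = list(activities)
    while pending:
        ready = [a for a in pending if all(d in done for d in depends_on.get(a, []))]
        if not ready:
            raise ValueError("circular or missing dependency")
        for name in ready:
            activities[name]()
            done.append(name)
            pending.remove(name)
    return done


# 'notify' is declared first but must wait for 'copy_orders' to finish.
order = run_pipeline(
    {"notify": lambda: None, "copy_orders": run_dataflow},
    {"notify": ["copy_orders"]},
)
print(order)
print(TARGET_TABLE)
```

The pipeline never touches the rows itself. It only sequences activities, which mirrors how pipelines orchestrate dataflows in Fabric.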
3) Data Activator
- Data Activator is a no-code experience in Microsoft Fabric for automatically taking actions when patterns or conditions are detected in changing data.
- It monitors data in Power BI reports and Eventstream items, for when the data hits certain thresholds or matches other patterns.
- It then automatically takes appropriate action such as alerting users or kicking off Power Automate workflows.
The example below shows data for the package delivery events stream. These events show the real-time status of packages that are in the process of being delivered.
Here, we created a reflex to monitor the temperature. If the temperature becomes greater than 50, it creates an alert.
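Data Activator itself is no-code, so the following is only a rough Python sketch of the logic behind such a rule, not anything Fabric generates or exposes. The event fields (package_id, temperature), the monitor function, and the sample readings are assumptions. The rule fires when the temperature becomes greater than the threshold, so a package that stays hot does not raise a new alert on every reading.

```python
def monitor(events, field, threshold):
    """Return one alert each time a package's value rises above the threshold."""
    was_above = {}  # last known state per package
    alerts = []
    for event in events:
        key = event["package_id"]
        is_above = event[field] > threshold
        # Alert only on the change from "not above" to "above".
        if is_above and not was_above.get(key, False):
            alerts.append(
                f"ALERT {key}: {field} rose above {threshold} (now {event[field]})"
            )
        was_above[key] = is_above
    return alerts


# Made-up delivery events, in arrival order.
events = [
    {"package_id": "PKG-1", "temperature": 42.0},
    {"package_id": "PKG-1", "temperature": 55.5},  # crosses 50: alert
    {"package_id": "PKG-1", "temperature": 58.0},  # still above: no new alert
    {"package_id": "PKG-2", "temperature": 50.0},  # equal, not greater: no alert
    {"package_id": "PKG-1", "temperature": 48.0},  # back below: state resets
    {"package_id": "PKG-1", "temperature": 51.0},  # crosses again: alert
]

alerts = monitor(events, field="temperature", threshold=50)
for line in alerts:
    print(line)
```

In Fabric, the action on the last step would be an email, a Teams message, or a Power Automate flow rather than a printed line.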
4) Industry Solution
Microsoft Fabric offers industry-specific data solutions that provide a robust platform for data management, analytics, and decision-making. These data solutions address the unique challenges faced by different industries, enabling businesses to optimise operations, integrate data from different sources, and use rich analytics.
Four solutions are currently available, most of them in preview. Microsoft documentation can be found here.
5) Real-Time Intelligence
Real-Time Intelligence offers an end-to-end solution for event-driven scenarios, streaming data, and data logs. Whether dealing with gigabytes or petabytes, all organisational data in motion converges in the Real-Time Hub.
It connects time-based data from various sources using no-code connectors, enabling immediate visual insights, geospatial analysis, and trigger-based reactions that are all part of an organisation-wide data catalog.
6) Synapse Data Engineering
Synapse Data Engineering is a Microsoft Fabric service that enables large-scale data engineering and analytics. It provides comprehensive tools for processing, storing, and transforming data, allowing for easy integration of various data sources.
7) Synapse Data Science
Microsoft Fabric offers Data Science experiences to empower users to complete end-to-end data science workflows for the purpose of data enrichment and business insights. You can complete a wide range of activities across the entire data science process, all the way from data exploration, preparation and cleansing to experimentation, modeling, model scoring and serving of predictive insights to BI reports.
The core capabilities of Synapse Data Science in Microsoft Fabric are:
- Complete workflow from data exploration to model deployment
- Code in Python or R, plus easy-to-use tools like Data Wrangler
- Team collaboration in one unified platform
- Run ML models on your data for powerful insights
- Enrich data within your analytics workflow
- Unlock predictive insights for better decision-making
Data Wrangler
Data Wrangler in Microsoft Fabric provides a graphical experience for exploring and pre-processing data. It generates the corresponding code for you and helps make sure your data is in the best possible shape before it's used to train a machine-learning model.
8) Synapse Data Warehouse
The lake-centric warehouse is built on an enterprise grade distributed processing engine that enables industry leading performance at scale while minimising the need for configuration and management. Living in the data lake and designed to natively support open data formats, Fabric data warehouse enables seamless collaboration between data engineers and business users without compromising security or governance.
The easy-to-use SaaS experience is also tightly integrated with Power BI for easy analysis and reporting, converging the world of data lakes and warehouses and greatly simplifying an organisation's investment in its analytics estate.
So now you may be asking:
Warehouse or Lakehouse?
- Choose a Data Warehouse when you need an enterprise-scale solution with an open standard format, no-knobs performance, and minimal setup. Best suited for semi-structured and structured data formats, the data warehouse suits both beginner and experienced data professionals, offering simple and intuitive experiences.
- Choose a Lakehouse when you need a large repository of highly unstructured data from diverse sources, want to leverage low-cost object storage, and want to use Spark as your primary development tool. The Lakehouse can also act as a 'lightweight' data warehouse: you always have the option to use the SQL endpoint and T-SQL tools to deliver reporting and data intelligence scenarios.
When you create and configure a new Lakehouse in the Data Engineering workload, it produces three named items in the Fabric-enabled workspace:
- Lakehouse is the Lakehouse storage and metadata, where you interact with files, folders, and table data.
- Semantic model (default) is an automatically created semantic model based on the tables in the Lakehouse. Power BI reports can be built from the semantic model.
- SQL analytics endpoint is a read-only SQL analytics endpoint through which you can connect and query data with Transact-SQL.
When you create a Warehouse (Synapse Data Warehouse) in a Microsoft Fabric workspace, only two items are created:
- Warehouse: In the Type column, you will be able to see Warehouse (see screenshot below)
- Semantic model (default)
One of the other terms you will hear in Fabric is KQL. So...
What is KQL?
- Kusto Query Language (KQL) originated in Microsoft's Israeli R&D center as a grassroots incubation project in 2014. The project's internal code name was "Kusto", named after French undersea explorer Jacques Cousteau. The name was chosen as a reference to "exploring the ocean of data".
- Microsoft developed KQL to provide a more intuitive and powerful querying capability than SQL. The language was originally developed for Azure Data Explorer, but is now used across various services, including Azure Monitor, Azure Security Center, and Azure Application Insights.
- KQL is optimised for searching through large amounts of data in a cloud environment.
- KQL is not far removed from regular SQL. Think of KQL as SQL's close cousin. A KQL query is written in roughly the order in which a SQL engine evaluates a query: start from the table, then filter, aggregate, and project. The screenshot below illustrates this.
In Real-Time Intelligence, you interact with your data in the context of eventhouses, databases, and tables. A single workspace can hold multiple Eventhouses, an eventhouse can hold multiple databases, and each database can hold multiple tables.
Once your KQL database has data, you can proceed to query your data using Kusto Query Language in a KQL queryset.
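As a rough illustration of both ideas, the nesting of eventhouse, database, and table and the step-by-step style of a KQL query, here is a small Python sketch. It does not call Fabric or any Kusto library. The names DeliveryDB and PackageEvents, the columns, and the helper functions are all made up, and the KQL and SQL queries in the comments are only there for comparison.

```python
from collections import Counter

# Hypothetical hierarchy: eventhouse -> database -> table -> rows.
eventhouse = {
    "DeliveryDB": {
        "PackageEvents": [
            {"package_id": "PKG-1", "status": "InTransit", "city": "Leeds"},
            {"package_id": "PKG-2", "status": "Delivered", "city": "Leeds"},
            {"package_id": "PKG-3", "status": "InTransit", "city": "York"},
            {"package_id": "PKG-4", "status": "InTransit", "city": "Leeds"},
        ]
    }
}

# KQL reads top to bottom, one step per line:
#   PackageEvents
#   | where status == "InTransit"
#   | summarize count() by city
#
# The same question in SQL:
#   SELECT city, COUNT(*) FROM PackageEvents
#   WHERE status = 'InTransit' GROUP BY city


def where(rows, predicate):
    """Keep only the rows that match, like KQL's 'where' step."""
    return [row for row in rows if predicate(row)]


def summarize_count_by(rows, column):
    """Count rows per value of a column, like 'summarize count() by'."""
    return dict(Counter(row[column] for row in rows))


# Each step feeds the next, just as each '|' does in the KQL query.
table = eventhouse["DeliveryDB"]["PackageEvents"]
in_transit = where(table, lambda row: row["status"] == "InTransit")
result = summarize_count_by(in_transit, "city")
print(result)
```

Notice that the Python steps follow the KQL query line by line, whereas the SQL version names the output columns first and the table second.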
My recommendations:
Use OneLake:
OneLake is a single, unified, logical data lake for your whole organisation. A data lake processes large volumes of data from various sources. Like OneDrive, OneLake comes automatically with every Microsoft Fabric tenant and is designed to be the single place for all your analytics data. OneLake brings customers:
- One data lake for the entire organisation
- One copy of data for use with multiple analytical engines. Shortcuts make this possible. A shortcut is a reference to data stored in other file locations. These file locations can be within the same workspace or across different workspaces, within OneLake or external to OneLake in ADLS, S3, or Dataverse, with more target locations coming soon. No matter the location, shortcuts make files and folders look like you have them stored locally.
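If the idea of a shortcut feels abstract, the sketch below shows the principle in plain Python. It is only a mental model, not how OneLake is implemented or configured: the SHORTCUTS table, the resolve function, the bucket name, and the "onelake://" style address are all invented for this example.

```python
# Hypothetical shortcut table: path as seen in the lakehouse -> where the data really lives.
SHORTCUTS = {
    "Files/sales_s3": "s3://example-bucket/sales",
    "Files/finance": "onelake://FinanceWorkspace/FinanceLakehouse/Files/ledger",
}


def resolve(path: str) -> str:
    """Translate a local-looking path into its real location."""
    for prefix, target in SHORTCUTS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return target + path[len(prefix):]
    return path  # not under a shortcut: the data is stored in this lakehouse


external = resolve("Files/sales_s3/2024/jan.parquet")
other_workspace = resolve("Files/finance/q1.csv")
local = resolve("Files/raw/events.json")
print(external)
print(other_workspace)
print(local)
```

The user only ever works with the "Files/..." paths. No data is copied, which is what keeps it to one copy of the data.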
Please note, while Fabric works great at reading data from platforms like Snowflake or AWS Redshift, it's not designed to push data into them. For best performance, pick one platform (whether it's Fabric or another) and use tools optimised for that platform.
Choosing Your Data Engine: SQL vs. Spark
For SQL Teams:
- Choose Fabric Warehouse if your team loves SQL
- Perfect for SQL Server experts who write queries and stored procedures
- Best for structured data and traditional database management
- Use T-SQL for your ETL work
For Python Teams:
- Choose Fabric Lakehouse with Spark if your team prefers Python
- Ideal for less-structured data environments
- Works through Spark notebooks using PySpark (Python), SparkSQL, Scala
About Spark:
Spark is a powerhouse that is super fast and scalable. It's become the go-to choice in the big data world, with notebooks being the preferred way to move and process data.
Pick what matches your team's skills - there's no wrong choice! Every team and organisation will prefer one or both. Make a decision and embrace it early on.
Microsoft Fabric Adoption Strategy:
Step 1: Assess Your Organisation's Readiness
- Leadership & Strategy: Data culture, Business alignment
- Organisation & Structure: Governance, content ownership, Centre of Excellence (CoE)
- People & Skills: Community of practice, mentoring, user support
- Management & Control: Change management, System oversight
Step 2: Plan Your Journey
Remember: Start where you are, not where you wish you were!
If You're Just Starting:
- Take small steps first - crawl before you walk
- Focus on smaller projects
- Address gaps before diving deeper
If You're More Advanced:
- Take on medium-scale projects
- Progress from walking to running
- Build on existing strengths
Key Success Tips:
- Align all stakeholders first
- Create clear plans to address gaps
- Don't rush - shortcomings need active planning and leadership
- Match your project scale to your readiness level
Thank you for taking the time to read my thoughts on Microsoft Fabric! As a passionate Data Analyst with a love for data, I'm always excited to explore new tools. If you enjoyed this blog or have any questions, please reach out. I would love to connect! Keep an eye out for more insights and tips in the world of data. — Ayesha