Unleash the Power of Data Activator feature in Microsoft Fabric

Microsoft Fabric offers a powerful feature called Data Activator that allows users to seamlessly integrate and activate their data for enhanced insights and decision-making. This feature enables users to connect various data sources, such as databases, APIs, and cloud services, to Fabric's analytics platform. With Data Activator, users can easily transform raw data into actionable …

Microsoft Fabric: Create and load your data into the Lakehouse table

What is a Lakehouse? A Lakehouse is a data management architecture that combines data lakes and data warehouses. Before we jump into the definition of a Lakehouse, it is important to understand the difference between a data lake and a data warehouse. A Lakehouse allows you to use both the data lake and the data warehouse together …

Microsoft Fabric Terminologies

The following are the basic terms used inside the Microsoft Fabric ecosystem. They are drawn from the official Fabric documentation and serve as a reference for all our future articles on Fabric. Generic Terms. Capacity: A dedicated set of resources available for use. It defines how much work a resource can handle. Different tasks …

PowerBI Vs Cleanlab Studio: The Best Tool for your Data Cleansing needs

Data cleansing is an essential step in data analysis: it removes inconsistencies and errors and ensures your dataset's accuracy and consistency. Using a dataset without proper cleansing leads to incorrect values and wrong insights in an organization's data-driven decisions. In this blog post, we will see how to get …

Why should you Migrate from Azure Synapse Analytics to Microsoft Fabric

Microsoft Fabric is a cloud-based data platform that provides a range of services for data engineering, data science, and business intelligence. It is an extension of Azure Synapse Analytics that integrates all analytics workloads from the data engineer to the business knowledge worker. Fabric brings together Power BI, Data Factory, and the Data Lake, on …

Reviewing the Built-in Roles available in the Azure Synapse Analytics

Azure Synapse Analytics has many built-in roles that help manage access to Synapse resources. These roles allow you to control what users and applications can do within a Synapse workspace. Synapse RBAC roles can be assigned by Synapse Administrators. A workspace-level Synapse Administrator can grant access to any workspace. A lower-level Synapse administrator …

Workload Management in Azure Synapse Analytics

Managing varied workloads with proper resource allocation in an environment with multiple concurrent users is the biggest challenge a team might face when retrieving data from an Azure Synapse Analytics dedicated SQL pool database. Workload management in Azure Synapse Analytics lets you control the workloads that are using your system resources. Setting up the best …

DWUs(Data Warehouse Units) in Synapse Dedicated Pool

There are two types of SQL pools in Azure Synapse Analytics: Serverless SQL Pool and Dedicated SQL Pool. As you might be aware, the serverless model is billed on a pay-per-use basis, calculated per TB of data processed by the queries that are run, whereas the costing of Dedicated SQL pools is …

Implementing Change Data Capture in Azure Data Factory

Change Data Capture (CDC): For any ETL requirement that involves a huge amount of data, most of the problem is solved when you eliminate repeated or redundant processing in your data storage mechanism. In other words, you should not repeat the work of copying or moving data that you already have in your destination datastore. Hence …

Change Data Capture In Azure Synapse Analytics & Data Factory

What is Change Data Capture? In data terminology, Change Data Capture, or simply CDC, is a method to track and pick only the data that has changed since the last known point in time. CDC is a feature that was already available in SQL Server for finding the changed records in a …

ADF | Delete files from Azure storage based on column value in Excel

In this article we are going to discuss how to pick and delete only specific files from an ADLS storage container by passing filenames taken from a column in an Excel/CSV file. File deletion: Recently I came across a requirement for file deletion in ADLS. Azure Data Factory’s delete activity is enough to complete this …

Monitoring Azure Synapse Analytics Workloads Using DMVs

Introduction: In this article we will look at Dynamic Management Views and how we can leverage them to monitor workloads in Azure Synapse Analytics. We will work through a practical use case and a few examples focusing on Synapse workload monitoring. Dynamic Management Views: Dynamic Management Views, or simply DMVs, are nothing …

Configure ADF Pipeline Output to a File

At an enterprise level, every project schedules and runs multiple Azure Data Factory pipelines, but tracking their outcomes in ADF Studio is a cumbersome process. In some companies, for every pipeline that fails with an error, the team must track it down by drilling into each activity until they find the failed one and …

Azure Synapse Security- Static Data Masking

Data security is a hot topic given the data breaches we hear about every day. Though there are various specialized tools available in the market, multiple questions arise about their accessibility, sharing, and data transfers within the organization. In most organizations there might be a need to refresh (copy) sensitive production data to multiple non-production environments …

Azure Synapse Security- Dynamic Data Masking

Dynamic data masking is a feature available in Synapse Analytics that restricts the exposure of sensitive data to end users. We can configure data masking to hide sensitive data in the result sets of users' queries. Using data masking we can not only restrict exposure but also specify the amount of …

Azure Synapse Analytics Link for SQL –Step by Step approach

This article provides a step-by-step guide for getting started with Azure Synapse Link for Azure SQL Database. For a better understanding, I strongly recommend you go through my previous article, which explains the basics of Synapse Link for SQL, before proceeding with creating it. Configure Source Azure SQL Database: Create a linked service to your …

The Basics of Azure Synapse Link for SQL

The newly released feature ‘Synapse Link for SQL’ enables near real-time analytics in Azure Synapse Analytics over operational data from both Azure SQL and SQL Server 2022. It provides seamless integration between the SQL database and Azure Synapse Analytics. Its rich feature set enables users to run analytics, machine learning or BI workloads on …

Monitor Azure Synapse Analytics using Log Analytics

Log Analytics monitors the Synapse pipelines and gives us more insight when a job fails. The Azure Synapse integration with Log Analytics is particularly useful in the following scenarios: You want to write complex queries on a rich set of metrics that are published by Azure Synapse to Log Analytics. Custom alerts …

Lake Database in Azure Synapse Analytics

Introduction: Azure Synapse Analytics provides standard database templates that various industries can use to create a database model suited to their company's needs. These are ready-made templates that come with rich metadata for clear understanding and can be implemented at any time in a few steps. Database templates are, in simple terms, business and technical data …

Create a Copy of Azure Data Factory using Azure ARM Templates

Introduction: In day-to-day operations, most of us have faced a requirement to back up and restore, or copy, an existing Azure Data Factory to a new one. In today's demo we will see how to back up and restore an Azure Data Factory using the ARM template export/import option in Azure Data Factory Studio. Steps: I will …

Pause dedicated SQL pools with Azure Synapse Pipelines

Introduction: One of the main objectives of any business that uses cloud services is to optimize resources and lower ongoing costs. Most organizations don't need access to the data warehouse layer round the clock, and they will be using reporting dashboards to view the information. In such scenarios it is best …

Parameterization using Notebooks in Azure Synapse Analytics

Introduction: Parameterization is very useful when you want reusable code that can serve all your future requirements, producing the output you need just by changing a parameter before executing it. Traditionally while coding you declare variables which are static (see image below), but with parameterization you can use dynamic parameters all through …

Create Synapse Notebook and run Python and SQL under Spark Pool

In this article we will look into how we can run both Python and Spark SQL queries in a single notebook under the built-in Apache Spark pools to transform data in a single window. Introduction: In Azure Synapse Analytics, a notebook is where you can write live code, visualize results, and add text comments. …

Extract file names and copy from source path in Azure Data Factory

We are going to see a real-world scenario showing how to extract the file names from a source path and then use them in any subsequent activity based on its output. This might be useful in cases where we have to extract file names, transform or copy data from CSV, Excel or flat files from …

CETAS (Creating External Table as Select) in Azure Synapse Analytics

Introduction: In this post we will discuss how to create an external table and store its data in parallel in your specified Azure storage using T-SQL statements. What is CETAS: CETAS or ‘Create External Table as Select’ can be used with both Dedicated SQL Pool and Serverless SQL Pool to create an external table …

Parameterize Pipelines and Datasets in Azure Data Factory with Demo

Introduction: Continuing from our previous article, we will look at how we can use parameterization in datasets and pipelines. We will also implement a pipeline with a simple copy activity to see how and where we can implement parameters in Azure Data Factory. Consider a scenario where you want to run numerous pipelines with …

Create External DataSource in Azure Synapse Analytics

Today we will check how to create an external data source to access data stored in other resources. If you remember, in one of our previous articles we discussed the Logical Data Warehouse (LDW), which works much like a database that you would see in Azure Synapse Analytics. …

Distributions in Azure Synapse Analytics

Continuing from our previous article on Azure Synapse Analytics, we will take a deep dive into the sharding patterns (distributions) that are used in the Dedicated SQL Pool. In the background, the Dedicated SQL Pool divides the work into 60 smaller queries which are run in parallel on your compute nodes. You will define the distribution …

Filter real-time error rows from CSV to SQL Database Table in Azure Data Factory – Part Two

**This is a continuation of part one. I suggest you check that first to get a clear understanding.** Once the first condition is completed, let’s check the second, which I named ValidRows because it captures only the non-error values. Compared to the first condition this is very simple as we …

Filter real-time error rows from CSV to SQL Database Table in Azure Data Factory – Part one

Azure Data Factory is a tool with tremendous capabilities when it comes to ETL operations. It has many features that help users clean and transform the data they load into it. Developers and users face many real-world issues when performing their ETL operations. One such common yet unavoidable scenario …

Azure Synapse Analytics Architecture

Azure Synapse SQL is a technology which resides inside the Synapse workspace. In total we have two pools, which we discussed in detail in one of our articles a few weeks ago: Dedicated SQL Pool and Serverless SQL Pool. The built-in ‘Serverless SQL Pool’ gets created automatically when you create the workspace and the ‘Dedicated SQL Pool’ is …

Integrate Pipelines with Azure Synapse Analytics

In line with our previous articles, today we will see how to create, schedule and monitor a pipeline in Synapse using Synapse Analytics Studio. A pipeline is an ETL workflow that we execute to produce results. A pipeline can be a single activity or a group of activities to be run. An activity is a task to implement …

Analyze data with Spark Pool in Azure Synapse Analytics – Part 2

This article is a continuation of Part 1, which I posted earlier. I strongly recommend you go through Part 1 before you read this article. The demo we are going to see will use the Apache Spark serverless pool model, where we will be loading a sample Parquet data file into a Spark database (yes, we …

Analyze data with Spark Pool in Azure Synapse Analytics – Part 1

This is part one of a two-part series, with a demo, which explains analyzing data with a Spark pool in Azure Synapse Analytics. Since the topic touches Apache Spark heavily, I have decided to write a dedicated article to explain Apache Spark in Azure, hence this part one. Please make sure to read the …

Query CSV File Saved In ADLS Through SQL Query – Azure Synapse Analytics

We are all aware that SQL is commonly used to query structured data, but in Synapse Analytics we can also use SQL to query unstructured data saved in files such as CSV and Parquet by using the OPENROWSET function. It is one of the many features Synapse Analytics offers. In this week’s article we …

What does Azure Synapse Analytics do?

Azure Synapse Analytics is a single solution for all data needs like ingesting, processing, and serving the data. It delivers a unified experience of data integration, data warehousing and big data analytics in a single workspace environment. Azure Synapse Analytics can be easily integrated with other services provided by Azure like Azure Machine Learning, Cosmos DB and …

Triggers in Azure Data Factory

Introduction: In this blog, we will look into Azure Data Factory triggers, an important feature for scheduling a pipeline to run without manual intervention each time. Apart from the usual advantage of scheduling the pipeline for future runs (which is very common), the Azure Data Factory trigger has a special feature to pick and process data from …

Parameterization in Azure Data Factory Linked Services

Introduction: Linked services in Azure Data Factory can be parameterized so that dynamic values are passed at run time. There might be a requirement where we want to connect to different databases on the same logical server, or to different database servers altogether. Traditionally we would create separate linked services for each database or database server but …

How to Copy Files Using Azure Data Factory Pipeline

Introduction: In this article we will look at our first hands-on exercise in Azure Data Factory by carrying out a simple file copy from our local machine to blob storage. The steps are given below with explanations and screenshots. Create a storage account: After creating the storage account, create a container which will hold the data that we …

Real time twitter analysis with Azure Stream Analytics and saving the results in to Azure blob storage

Introduction: A lot of consumer data is being posted on social media every minute and social media analysis has become a critical component in audience analysis, competitive research, and product research. Social media analytics and its tools are helping organizations around the world understand currently trending topics. Trending topics are those subjects and attitudes that …

Send or receive events from Azure Event Hub using Python

This article is a quickstart demo of how one can send or receive events from Azure Event Hub using Python scripts. If you are new to Event Hubs, please check my previous post, which explains the basics, before you continue. We will be using two Python scripts, 'send.py' and 'recv.py', for sending and receiving test …

Azure Stream Analytics

Azure Stream Analytics is a fully managed PaaS (Platform-as-a-Service) real-time streaming service provided by Microsoft. It consists of a complex event processing engine designed to analyze and process vast volumes of real-time data from sources such as stock trading, credit card fraud detection, web clickstream analysis, social media feeds and other applications. For quicker analysis of …