79554799

Date: 2025-04-04 08:15:39
Score: 0.5

Can anyone please provide me with up-to-date sample code to read and write CSV files from Azure ML Studio to CDL (Data Lake) Gen 2 using Azure Datastores in Python? Also, please provide me with resources where I can read about the changes in Gen 2 and the Azure libraries that need to be installed for this.

You can use the code below to read and write CSV files between Azure Machine Learning (Azure ML) Studio and Azure Data Lake Storage Gen2 using Azure Datastores in Python.

To register the Azure Data Lake Gen2 account as a datastore:

Code:

from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureDataLakeGen2Datastore
from azure.identity import DefaultAzureCredential

# Authenticate with Azure ML Workspace
credential = DefaultAzureCredential()

ml_client = MLClient(
    credential=credential,
    subscription_id="xxxx",
    resource_group_name="xxx",
    workspace_name="xx"
)

# Define the ADLS Gen2 Datastore
# (no credentials are passed here, so identity-based access is used)
datastore = AzureDataLakeGen2Datastore(
    name="sampledatastore",
    account_name="xxx",
    filesystem="xxx",
)

# Register the Datastore
ml_client.datastores.create_or_update(datastore)

print("Datastore registered successfully!")

To read a CSV file from the datastore:

Code:

# Reading azureml:// URIs with pandas requires the azureml-fsspec package
import pandas as pd

# Define path to the CSV file in ADLS Gen2
csv_path = "azureml://subscriptions/xxxxx/resourcegroups/vexxxx/workspaces/xxxx/datastores/xxxx/paths/003.csv"
df = pd.read_csv(csv_path)
print(df.head())

Output:

 CATEGORY    TIME                   INDICATOR  \
0  Rankings  2016.0                         NaN   
1       NaN     NaN      Health Outcomes - Rank   
2       NaN     NaN  Health Outcomes - Quartile   
3       NaN     NaN       Health Factors - Rank   
4       NaN     NaN   Health Factors - Quartile   
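
The csv_path used above follows a fixed layout: subscription, resource group, workspace, datastore, then the file path inside the filesystem. As a minimal sketch, a small helper can assemble that string so the pieces are not pasted together by hand. The function name build_datastore_uri is hypothetical (it is not part of the Azure SDK), and the placeholder values simply mirror the ones in the example above.

```python
def build_datastore_uri(subscription_id, resource_group, workspace, datastore, relative_path):
    """Assemble an azureml:// datastore URI in the layout used by csv_path above.

    Hypothetical helper for illustration only: it formats a string and
    does not contact Azure or validate that the resources exist.
    """
    # Strip a leading slash so the result never contains "paths//".
    relative_path = relative_path.lstrip("/")
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}"
        f"/paths/{relative_path}"
    )


# Placeholder values, same as in the read example above.
csv_path = build_datastore_uri("xxxxx", "vexxxx", "xxxx", "xxxx", "003.csv")
print(csv_path)
```

The resulting string can then be passed to pd.read_csv exactly as in the read example.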


To write a CSV file to the datastore:

Code:

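import pandas as pd
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data

# ml_client and datastore are the objects created in the registration step
df = pd.DataFrame({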
    "Name": ["Alice", "Bob"],
    "Score": [90, 85]
})

# Save to local temporary CSV file
df.to_csv("sample.csv", index=False)

# Upload it to the datastore
data_asset = Data(
    path="sample.csv",
    type=AssetTypes.URI_FILE,
    name="csv-upload",
    description="Sample CSV upload to Data Lake",
    datastore=datastore.name
)

uploaded_data = ml_client.data.create_or_update(data_asset)
print("CSV uploaded to:", uploaded_data.path)

Output:

Uploading sample.csv (< 1 MB): 27.0B [00:00, 78.8B/s]
CSV uploaded to: azureml://subscriptionsxxx/resourcegroups/xx/workspaces/xxce/datastores/xxx/paths/LocalUpload/99xxx4/sample.csv
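
The path printed above shows where the file landed: the datastore name, followed by the location under paths/. A rough sketch of pulling those pieces back out of such a URI is shown here, for example to find the uploaded file in the storage account later. The function parse_datastore_uri and the sample URI values are hypothetical and are not part of the Azure SDK.

```python
import re

# Pattern for the azureml:// datastore URI layout shown in the output above.
_URI_PATTERN = re.compile(
    r"^azureml://subscriptions/(?P<subscription>[^/]+)"
    r"/resourcegroups/(?P<resource_group>[^/]+)"
    r"/workspaces/(?P<workspace>[^/]+)"
    r"/datastores/(?P<datastore>[^/]+)"
    r"/paths/(?P<path>.+)$"
)


def parse_datastore_uri(uri):
    """Split a datastore URI into its named components (hypothetical helper)."""
    match = _URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"Not a datastore URI: {uri!r}")
    return match.groupdict()


# Made-up example values; substitute the path returned by create_or_update.
parts = parse_datastore_uri(
    "azureml://subscriptions/sub/resourcegroups/rg/workspaces/ws"
    "/datastores/sampledatastore/paths/LocalUpload/abc123/sample.csv"
)
print(parts["datastore"], parts["path"])
```

The path component is relative to the filesystem (container) that was configured when the datastore was registered.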


Reference: Use datastores - Azure Machine Learning | Microsoft Learn

Reasons:
  • Whitelisted phrase (-1.5): You can use
  • RegEx Blacklisted phrase (2.5): Can anyone please provide me
  • RegEx Blacklisted phrase (2.5): please provide me
  • Long answer (-1):
  • Has code block (-0.5):
  • Starts with a question (0.5): Can anyone please
  • High reputation (-2):
Posted by: Venkatesan