To retrieve SAP data, you first need to create an SAP OData Glue connection.
Follow this guide to create the Glue connection: https://catalog.us-east-1.prod.workshops.aws/workshops/541dd428-e64a-41da-a9f9-39a7b3ffec17/en-US/lab05-glue-sap
Test the connection to make sure connectivity and authentication succeed.
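If you also want to verify the connection from code, here is a minimal boto3 sketch (the connection name below is a placeholder) that confirms the connection exists and shows its type. Note that credential validation itself is easiest through the console's "Test connection" action:

import boto3

glue = boto3.client("glue")

# Fetch the connection metadata; this confirms the connection exists
# and lets you inspect its type and properties. The name below is a
# placeholder -- use the name you gave your SAP OData connection.
conn = glue.get_connection(Name="your-sap-odata-connection")["Connection"]
print(conn["ConnectionType"])              # expected: SAPODATA
print(conn.get("ConnectionProperties", {}))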
Then create a Glue ETL job that reads the SAP OData entity and writes it to S3.
(Grant the Glue job's IAM role the necessary privileges, such as read/write access to the target S3 bucket and permission to use the Glue connection; a sketch of this setup follows below.)
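As one way to script that role setup, here is a minimal boto3 sketch. The role name, policy name, and bucket are placeholders, and your environment may need additional permissions (for example, Secrets Manager access to the connection credentials):

import json

import boto3

iam = boto3.client("iam")

# Placeholder bucket and role names -- replace with your own.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Read/write access to the target S3 bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::your-sap-s3-bucket-name",
                "arn:aws:s3:::your-sap-s3-bucket-name/*",
            ],
        },
        {
            # Allow the job to look up the Glue connection
            "Effect": "Allow",
            "Action": ["glue:GetConnection"],
            "Resource": "*",
        },
    ],
}

iam.put_role_policy(
    RoleName="YourGlueJobRole",
    PolicyName="sap-odata-etl-access",
    PolicyDocument=json.dumps(policy),
)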
You can refer to this sample ETL script:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
# Read from the SAP OData entity via the Glue connection
SAPOData_node = glueContext.create_dynamic_frame.from_options(
    connection_type="sapodata",
    connection_options={
        "connectionName": "your-sap-odata-connection",  # name of the Glue connection you created
        "ENTITY_NAME": "/sap/opu/odata/sap/API_PRODUCT_SRV/Sample_Product",  # your SAP OData entity path
    },
    transformation_ctx="SAPOData_node",
)
# Write the results to S3 as Parquet
output_path = "s3://your-sap-s3-bucket-name/sap-products/"
glueContext.write_dynamic_frame.from_options(
    frame=SAPOData_node,
    connection_type="s3",
    connection_options={
        "path": output_path,
        "partitionKeys": [],  # add partition keys if needed, e.g., ["ProductType"]
    },
    format="parquet",
    transformation_ctx="S3Output_node",
)
job.commit()
Run the ETL job and check the S3 output path for the resulting Parquet files.
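If you prefer to start and monitor the job from code rather than the console, here is a small boto3 sketch (the job name sap-odata-to-s3 is a placeholder for whatever you named your Glue ETL job):

import time

import boto3

glue = boto3.client("glue")

# Start the job; the name is a placeholder for your Glue ETL job.
run_id = glue.start_job_run(JobName="sap-odata-to-s3")["JobRunId"]

# Poll until the run reaches a terminal state.
while True:
    run = glue.get_job_run(JobName="sap-odata-to-s3", RunId=run_id)["JobRun"]
    state = run["JobRunState"]
    if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        break
    time.sleep(30)

print(f"Job run {run_id} finished with state {state}")

Once the run reaches SUCCEEDED, the Parquet files should be visible under the S3 output path.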