To achieve the desired output JSON, you need to process both merged cells and the regular data rows, extract their relationships, and format them into the hierarchical JSON structure. Here's how you can do it:
import pandas as pd
from openpyxl import load_workbook
import json
# Path to your Excel file
filepath = r'filename.xlsx'
# Load the workbook and worksheet
wb = load_workbook(filename=filepath)
sheet = wb.active
# Extract merged cell values
merged_cells = {}
for merged_range in sheet.merged_cells.ranges:
    start_cell = merged_range.start_cell
    for row in range(merged_range.min_row, merged_range.max_row + 1):
        for col in range(merged_range.min_col, merged_range.max_col + 1):
            merged_cells[(row, col)] = start_cell.value
# Load the Excel sheet into a DataFrame
df = pd.read_excel(filepath, skiprows=2)
# Process DataFrame to construct JSON
result = {}
social_media = {}
timings = {}
# Extract Social Media section
for col in df.columns[:4]:  # First four columns are for Social Media
    social_media[col] = df[col].iloc[0]  # First data row holds the Social Media values
# Extract Timings section
for col, value in zip(df.columns[4:], df.iloc[0, 4:]):  # Remaining columns for Timings
    timings[col] = value
# Combine into JSON structure
result["Test"] = {"Social Media": [social_media]}
result["Timings"] = [timings]
# Convert to JSON
json_data = json.dumps(result, indent=2)
print(json_data)
Expected output:
{
  "Test": {
    "Social Media": [
      {
        "Instagram": "Posts",
        "Youtube": "Shorts",
        "Twitter": "Tweet",
        "Facebook": "Likes"
You can also find some related blog posts at https://techlusion.io/insight/
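One thing to note about the script above: it fills the merged_cells dictionary but never reads it, and the section names ("Social Media", "Timings") are hard-coded. If you would rather derive the grouping from the merged header row itself, the following is a minimal sketch of one way to do it. It assumes the sheet has a group header row merged across columns, followed by a field header row and a data row. When openpyxl reads a merged range, only the top-left cell carries the value and the remaining cells come back as None, so the sketch forward-fills those gaps. The function names fill_merged_headers and nest_row are hypothetical, and the "Start"/"End" columns with their times are made-up sample data, not taken from your file.

```python
import json


def fill_merged_headers(header_row):
    """Forward-fill None entries so every column knows its group header.

    A merged header read with openpyxl has a value only in its first
    column; the other columns of the merge are None.
    """
    filled, last = [], None
    for value in header_row:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def nest_row(group_row, field_row, value_row):
    """Group each field/value pair under its (merged) group header."""
    nested = {}
    groups = fill_merged_headers(group_row)
    for group, field, value in zip(groups, field_row, value_row):
        nested.setdefault(group, {})[field] = value
    return nested


# Sample rows, shaped like what sheet.iter_rows(values_only=True) returns.
# "Start", "End" and the times are invented for illustration only.
group_row = ["Social Media", None, None, None, "Timings", None]
field_row = ["Instagram", "Youtube", "Twitter", "Facebook", "Start", "End"]
value_row = ["Posts", "Shorts", "Tweet", "Likes", "09:00", "17:00"]

nested = nest_row(group_row, field_row, value_row)
print(json.dumps(nested, indent=2))
```

This produces plain nested dictionaries. To match the list-wrapped layout in the expected output, you would wrap each group in a list (for example [nested["Social Media"]]) and place it under the "Test" key as the main script does.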