Yelp data set: converting from json to csv

This is a complementary explanation of how to convert the Yelp dataset from JSON format into CSV file(s) so that we can use them in data analytics. The topic is covered in a Coursera class that I’ve worked on, Social Media Data Analytics.

Yelp JSON File

Suppose you are converting business data in JSON format (e.g., yelp_academic_dataset_business.json, which you downloaded from the Yelp site).

The data structure in the file would be as follows:


    {
        'type': 'business',
        'business_id': (encrypted business id),
        'name': (business name),
        'neighborhoods': [(hood names)],
        'full_address': (localized address),
        'city': (city),
        'state': (state),
        'latitude': latitude,
        'longitude': longitude,
        'stars': (star rating, rounded to half-stars),
        'review_count': review count,
        'categories': [(localized category names)],
        'open': True / False (corresponds to closed, not business hours),
        'hours': {
            (day_of_week): {
                'open': (HH:MM),
                'close': (HH:MM)
            }
        },
        'attributes': {
            (attribute_name): (attribute_value),
        }
    }
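
To see how one record maps onto this structure, here is a minimal sketch that parses a single line with Python's json module. The sample record is made up for illustration (it is not taken from the Yelp data), and the sketch assumes the file stores one JSON object per line, which is what the conversion script relies on.

```python
import json

# Hypothetical record, shaped like one line of the business file
sample_line = (
    '{"type": "business", "business_id": "abc123", "name": "Example Cafe", '
    '"categories": ["Cafes", "Food"], "open": true, '
    '"hours": {"Monday": {"open": "08:00", "close": "17:00"}}}'
)

record = json.loads(sample_line)

# Top-level scalar fields come back as plain Python values
business_name = record["name"]
is_open = record["open"]

# Nested fields come back as lists and dicts
first_category = record["categories"][0]
monday_close = record["hours"]["Monday"]["close"]

print(business_name, is_open, first_category, monday_close)
```

Note that 'categories' is a list and 'hours' is a dict of dicts, so they need more care than the flat fields when written to a CSV cell.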


Python Script

The Python script for the conversion is as follows:


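import sys
import json
import csv

# The input path is required; the output path is optional
# and defaults to <input>.csv
ifilename = sys.argv[1]
ofilename = sys.argv[2] if len(sys.argv) > 2 else ifilename + ".csv"

# Business fields to export, in column order
fields = ["business_id", "name", "full_address", "hours", "open",
          "categories", "city", "state", "review_count", "stars"]

json_no = 0
# newline="" lets the csv module control line endings (Python 3)
with open(ifilename, encoding="utf-8") as in_file, \
     open(ofilename, "w", newline="", encoding="utf-8") as out_file:
    root = csv.writer(out_file)
    for line in in_file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        l = json.loads(line)  # each line is one JSON object
        root.writerow([l[f] for f in fields])
        json_no += 1

print('Finished {0} lines'.format(json_no))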

As you can see, this script exports only a handful of business fields: business_id, name, full_address, hours, open, categories, city, state, review_count, and stars. It reads the JSON file line by line, parses each line, and writes the values of the selected fields as one CSV row.
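
One thing to watch for: hours and categories are nested values (a dict and a list), and csv.writer stores their Python string form in the cell. If you would rather have cleaner cells, one option is to flatten them before writing. The sketch below is only a suggestion and not part of the class script; the helper name flatten_business, the "; " separator, and the sample record are my own assumptions.

```python
import csv
import io
import json

def flatten_business(record):
    """Return one CSV row for a business record (hypothetical helper)."""
    return [
        record["business_id"],
        record["name"],
        "; ".join(record["categories"]),              # list -> "A; B"
        json.dumps(record["hours"], sort_keys=True),  # dict -> JSON text
    ]

# Made-up record for illustration
record = {
    "business_id": "abc123",
    "name": "Example Cafe",
    "categories": ["Cafes", "Food"],
    "hours": {"Monday": {"open": "08:00", "close": "17:00"}},
}

buffer = io.StringIO()  # stands in for the output file
csv.writer(buffer).writerow(flatten_business(record))
csv_text = buffer.getvalue()
print(csv_text)
```

Storing hours as JSON text keeps the cell parseable later with json.loads, whereas the default string form of a dict is not valid JSON.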



Now you can convert the JSON file into a CSV file by running the following command in your system console or from a Python IDE, where <script>.py stands for whatever name you saved the script under:

>> python <script>.py yelp_academic_dataset_business.json
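
After the run, a quick sanity check is to read the CSV back and count the rows. The sketch below writes a tiny made-up CSV to a temporary file so that it runs on its own; in practice you would point csv_path at the file the script produced (the script appends ".csv" to the input name). The function name count_rows and the sample rows are assumptions for illustration.

```python
import csv
import os
import tempfile

def count_rows(csv_path):
    """Return (number of rows, first row) for a CSV file without a header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    return len(rows), (rows[0] if rows else None)

# Stand-in for the converted file: two made-up rows, no header
tmp = tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
)
with tmp:
    writer = csv.writer(tmp)
    writer.writerow(["abc123", "Example Cafe", "Phoenix", "AZ", 12, 4.5])
    writer.writerow(["def456", "Sample Diner", "Tempe", "AZ", 7, 3.0])

row_count, first_row = count_rows(tmp.name)
print(row_count, first_row)

os.remove(tmp.name)  # clean up the temporary file
```

The row count should match the number reported by the script's "Finished ... lines" message. Keep in mind that csv.reader returns every value as a string, so numeric columns such as review_count and stars need converting before analysis.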