Save SQL query results in a pandas DataFrame

You can store the results of your SQL query in a pandas DataFrame. The easiest way to output query results to a DataFrame is to use the SQL editor features of the JupyterLab SQL extension query-result dropdown and choose the Pandas dataframe option.

Alternatively, you can add the parameter --output '{"format": "DATAFRAME", "dataframe_name": "dataframe_name"}' to your connection string.

For example, the following query extracts details of customers with the highest balance from the Customer table in Snowflake's TPCH_SF1 database, using both pandas and SQL:

In this example, we extract all the data from the customer table and save then in a DataFrame named all_customer_data.


%%sm_sql --output '{"format": "DATAFRAME", "dataframe_name": "all_customer_data"}' --metastore-id snowflake-connection-name --metastore-type GLUE_CONNECTION
SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER


Saved results to all_customer_data

Next, we extract the details of the highest account balance from the DataFrame.


all_customer_data.loc[all_customer_data['C_ACCTBAL'].idxmax()].values


array([61453, 'Customer#000061453', 'RxNgWcyl5RZD4qOYnyT3', 15,
'25-819-925-1077', Decimal('9999.99'), 'BUILDING','es. carefully regular requests among the blithely pending requests boost slyly alo'],
dtype=object)

Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Create a simple connection

Override connection properties