%load_ext sql
%sql duckdb://
Loading configurations from C:\Users\NKeleher\code\ipa-data-tech-handbook\pyproject.toml.
Settings changed:
| Config | value | 
|---|---|
| feedback | 0 | 
| autopandas | False | 
| autopolars | False | 
| autolimit | 0 | 
| displaylimit | 10 | 
%load_ext sql
%sql duckdb://| Config | value | 
|---|---|
| feedback | 0 | 
| autopandas | False | 
| autopolars | False | 
| autolimit | 0 | 
| displaylimit | 10 | 
Create a table from a CSV file.
%%sql
CREATE TABLE penguins AS SELECT * FROM '../assets/data/penguins.csv'| Count | 
|---|
Inspect the columns of the table.
%sqlcmd columns -t penguins| name | type | nullable | default | autoincrement | comment | 
|---|---|---|---|---|---|
| species | VARCHAR | True | None | False | None | 
| island | VARCHAR | True | None | False | None | 
| bill_length_mm | FLOAT | True | None | False | None | 
| bill_depth_mm | FLOAT | True | None | False | None | 
| flipper_length_mm | BIGINT | True | None | False | None | 
| body_mass_g | BIGINT | True | None | False | None | 
| sex | VARCHAR | True | None | False | None | 
Query the table using a SELECT statement.
%%sql
SELECT *
FROM '../assets/data/penguins.csv'
LIMIT 3| species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | 
|---|---|---|---|---|---|---|
| Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | MALE | 
| Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | FEMALE | 
| Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | FEMALE | 
Run analytical operations on the data.
%%sql
SELECT species, COUNT(*) AS count
FROM '../assets/data/penguins.csv'
GROUP BY species
ORDER BY count DESC| species | count | 
|---|---|
| Adelie | 152 | 
| Gentoo | 124 | 
| Chinstrap | 68 | 
Using jupysql store SQL query results as an object in a Jupyter notebook.
%%sql species_count <<
SELECT species, COUNT(*) AS count
FROM '../assets/data/penguins.csv'
GROUP BY species
ORDER BY count DESCPlot the results of the query.
ax = %sqlplot bar --table '../assets/data/penguins.csv' --column species
# customize plot (this is a matplotlib Axes object)
_ = ax.set_title("Count of penguins by species")
Use SQLFluff to lint SQL code.
Add SQLFluff to your virtual environment using pixi:
pixi add sqlfluff --pypiAssuming SQLFluff is installed in your virtual environment using pixi, you can run the following command to lint SQL code:
pixi run sqlfluff fix --dialect duckdb