{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Managing Tables\n", "\n", "This notebook shows how you can manage and utilize tables with the ThanoSQL library. We will cover how to create, update, delete, and list tables.\n", "First, initialize a ThanoSQL client with your API token and engine URL." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from thanosql import ThanoSQL\n", "\n", "client = ThanoSQL(\n", " api_token=\"THANOSQL_API_TOKEN\",\n", " engine_url=\"THANOSQL_ENGINE_URL\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Listing Tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get the list of stored tables using the ThanoSQL library. The `list()` function can be used without any parameters or with optional parameters. To list only tables in a certain schema, use the `schema` parameter. By default, all tables from all schemas will be listed. You can also set the `offset` and `limit` of the results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tables = client.table.list(schema=\"my_schema\", offset=1, limit=10)\n", "for table in tables:\n", " print(table.name, table.table_schema)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, `list()` only shows the name and schema of stored tables. In order to show more information, enable the `verbose` parameter, which is set to `False` by default." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tables = client.table.list(verbose=True)\n", "for table in tables:\n", " print(table.columns, table.constraints)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also choose to only show a table with a certain name with `get()`. Table and schema combination is unique; this policy is enforced during table creation. If `schema` is not specified, the client will by default look at the `public` schema." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "table = client.table.get(\"table_name\")\n", "table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Listing Table Records" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `get()` method will return a `Table` object, which allows you to interact with the name, schema, columns, constraints, and records of the table. You can use `get_records()` to obtain the contents of the table as a `Records` object, which consists of the `records` and `total` number." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "table_records = table.get_records()\n", "table_records" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can also get the records as a Pandas DataFrame using the `to_df()` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = table_records.to_df()\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inserting Table Records" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On top of listing records, you can also insert records to a table using the library. Provide a list of the dictionary representation of each new row, and then use it as an input to the `insert()` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = [{\"id\": 1, \"name\": \"Alice\", \"age\": 24}, {\"id\": 2, \"name\": \"Bob\", \"age\": 26}]\n", "table.insert(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**\n", "\n", "Make sure that all the records match the format (shape, column names) of the target table. Otherwise, an error will occur." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating Tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create tables using the library. However, a few extra imports are needed to construct the table object. You need to first construct `Column` objects and `Constraint` objects, if applicable. This is the same process as creating a [table template](./managing_table_templates.ipynb). The created table object will be returned as a response; you can use this value in your code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from thanosql.resources import BaseColumn, Constraints, ForeignKey, PrimaryKey, TableObject, Unique\n", "\n", "\n", "# construct list of columns\n", "columns = [BaseColumn(type=\"integer\", name=\"column_1\"), BaseColumn(type=\"varchar\", name=\"column_2\"), BaseColumn(type=\"double precision\", name=\"column_3\"), BaseColumn(type=\"timestamp\", name=\"column_4\")]\n", "\n", "# construct unique columns (optional)\n", "unique = Unique(name=\"my_unique_1\", columns=[\"column_2\"])\n", "\n", "# construct primary key (optional but recommended)\n", "primary_key = PrimaryKey(name=\"my_pk_1\", columns=[\"column_1\"])\n", "\n", "# construct foreign keys (requires another existing table -- not table template)\n", "foreign_key = ForeignKey(name=\"my_fk_1\", reference_schema=\"public\", reference_column=\"another_column_1\", reference_table=\"another_table\", column=\"column_3\")\n", "\n", "# assemble everything into a table object\n", "table_object = TableObject(columns=columns, constraints=Constraints(unique=[unique], primary_key=primary_key, foreign_keys=[foreign_key]))\n", "\n", "# create the table template using ThanoSQL client, with default schema (public)\n", "res = client.table.create(name=\"my_table_1\", table=table_object)\n", "res" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Uploading Tables from CSV/Excel File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A faster way to create tables is by using already-existing CSV or Excel-like files. You can upload the file to a table directly in your workspace. The table shape and contents will be preserved; the type of each column will be inferred by the library. By default, tables will be created in the `public` schema, unless `schema` is specified. Like `create()`, the created table will be returned as a response." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "res = client.table.upload(name=\"table_from_csv\", file=\"my_csv_file.csv\")\n", "res # Table object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also pass a `TableObject` to `upload()` using the `table` parameter in order to have a more precise table. If `table` is specified, the value will be directly used and table inference will not be conducted. This allows you to provide column types explicitly and specify constraints. In this case, `upload()` will only insert contents from the file into the specified table. Note that because of this, the shape of the `TableObject` and the table from the CSV/Excel-like file must match; otherwise, an error will occur." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client.table.upload(name=\"table_from_excel\", file=\"my_excel_file.xlsx\", table=table_object)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, table uploading will fail if another table of the same name already exists in the same schema. This behavior can be altered by setting the `if_exists` parameter. There are only three available values: \"fail\", \"append\", and \"replace\". \"fail\" is the default behavior. \"append\" will append the file contents to an existing table, if it exists. In this case, the shape of the new table and the old table must be the same. \"replace\" will overwrite the existing table with the new one. For example, if we want to insert the contents of another Excel file to the Excel table we just created, we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client.table.upload(name=\"table_from_excel\", file=\"my_excel_file_continued.xls\", if_exists=\"append\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Uploading Tables from Pandas DataFrame" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uploading tables is also possible from a Pandas DataFrame. However, note that you cannot supply both a DataFrame and a file at the same time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import seaborn as sns\n", "\n", "iris_df = sns.load_dataset('iris')\n", "client.table.upload(name=\"table_from_df\", df=iris_df, schema=\"my_schema\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Updating Tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can update a table once you store it in the database. Similar to creating a table, you need to import a few objects to construct the `Table` (not `TableObject`) object. You can change the name and/or schema of the table by specifying a `name` and `schema` in the `Table`, respectively. `update()` also returns the updated table object, similar to `create()`.\n", "\n", "For example, if we want to rename and update the table we just created by removing all constraints except for the primary key, we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from thanosql.resources import Table\n", "\n", "table_object_updated = Table(name=\"my_table_1_updated\", columns=columns, constraints=Constraints(primary_key=primary_key))\n", "\n", "# update and then check if the new values are passed in correctly\n", "res = client.table.update(name=\"my_table_1\", table=table_object_updated)\n", "res" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deleting Tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can remove tables that you no longer use. You simply need to specify the name of the table if the table resides in the `public` schema. For tables in other schemas, you also have to use the `schema` parameter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client.table.delete(\"my_table_1\")\n", "client.table.delete(name=\"my_table_1_in_my_schema\", schema=\"my_schema\")" ] } ], "metadata": { "kernelspec": { "display_name": "env", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 2 }