Managing Tables

This notebook shows how you can manage and utilize tables with the ThanoSQL library. We will cover how to create, update, delete, and list tables. First, initialize a ThanoSQL client with your API token and engine URL.

from thanosql import ThanoSQL

client = ThanoSQL(
    api_token="THANOSQL_API_TOKEN",
    engine_url="THANOSQL_ENGINE_URL"
)

Listing Tables

You can get the list of stored tables using the ThanoSQL library. The list() function can be used without any parameters or with optional parameters. To list only tables in a certain schema, use the schema parameter. By default, all tables from all schemas will be listed. You can also set the offset and limit of the results.

tables = client.table.list(schema="my_schema", offset=1, limit=10)
for table in tables:
    print(table.name, table.table_schema)

By default, list() only shows the name and schema of stored tables. In order to show more information, enable the verbose parameter, which is set to False by default.

tables = client.table.list(verbose=True)
for table in tables:
    print(table.columns, table.constraints)

You can also choose to only show a table with a certain name with get(). Table and schema combination is unique; this policy is enforced during table creation. If schema is not specified, the client will by default look at the public schema.

table = client.table.get("table_name")
table

Listing Table Records

The get() method will return a Table object, which allows you to interact with the name, schema, columns, constraints, and records of the table. You can use get_records() to obtain the contents of the table as a Records object, which consists of the records and total number.

table_records = table.get_records()
table_records

Alternatively, you can also get the records as a Pandas DataFrame using the to_df() method.

df = table_records.to_df()
df

Inserting Table Records

On top of listing records, you can also insert records to a table using the library. Provide a list of the dictionary representation of each new row, and then use it as an input to the insert() method.

data = [{"id": 1, "name": "Alice", "age": 24}, {"id": 2, "name": "Bob", "age": 26}]
table.insert(data)

Note

Make sure that all the records match the format (shape, column names) of the target table. Otherwise, an error will occur.

Creating Tables

You can create tables using the library. However, a few extra imports are needed to construct the table object. You need to first construct Column objects and Constraint objects, if applicable. This is the same process as creating a table template. The created table object will be returned as a response; you can use this value in your code.

from thanosql.resources import BaseColumn, Constraints, ForeignKey, PrimaryKey, TableObject, Unique


# construct list of columns
columns = [BaseColumn(type="integer", name="column_1"), BaseColumn(type="varchar", name="column_2"), BaseColumn(type="double precision", name="column_3"), BaseColumn(type="timestamp", name="column_4")]

# construct unique columns (optional)
unique = Unique(name="my_unique_1", columns=["column_2"])

# construct primary key (optional but recommended)
primary_key = PrimaryKey(name="my_pk_1", columns=["column_1"])

# construct foreign keys (requires another existing table -- not table template)
foreign_key = ForeignKey(name="my_fk_1", reference_schema="public", reference_column="another_column_1", reference_table="another_table", column="column_3")

# assemble everything into a table object
table_object = TableObject(columns=columns, constraints=Constraints(unique=[unique], primary_key=primary_key, foreign_keys=[foreign_key]))

# create the table template using ThanoSQL client, with default schema (public)
res = client.table.create(name="my_table_1", table=table_object)
res

Uploading Tables from CSV/Excel File

A faster way to create tables is by using already-existing CSV or Excel-like files. You can upload the file to a table directly in your workspace. The table shape and contents will be preserved; the type of each column will be inferred by the library. By default, tables will be created in the public schema, unless schema is specified. Like create(), the created table will be returned as a response.

res = client.table.upload(name="table_from_csv", file="my_csv_file.csv")
res  # Table object

You can also pass a TableObject to upload() using the table parameter in order to have a more precise table. If table is specified, the value will be directly used and table inference will not be conducted. This allows you to provide column types explicitly and specify constraints. In this case, upload() will only insert contents from the file into the specified table. Note that because of this, the shape of the TableObject and the table from the CSV/Excel-like file must match; otherwise, an error will occur.

client.table.upload(name="table_from_excel", file="my_excel_file.xlsx", table=table_object)

By default, table uploading will fail if another table of the same name already exists in the same schema. This behavior can be altered by setting the if_exists parameter. There are only three available values: “fail”, “append”, and “replace”. “fail” is the default behavior. “append” will append the file contents to an existing table, if it exists. In this case, the shape of the new table and the old table must be the same. “replace” will overwrite the existing table with the new one. For example, if we want to insert the contents of another Excel file to the Excel table we just created, we can do the following:

client.table.upload(name="table_from_excel", file="my_excel_file_continued.xls", if_exists="append")

Uploading Tables from Pandas DataFrame

Uploading tables is also possible from a Pandas DataFrame. However, note that you cannot supply both a DataFrame and a file at the same time.

import seaborn as sns

iris_df = sns.load_dataset('iris')
client.table.upload(name="table_from_df", df=iris_df, schema="my_schema")

Updating Tables

You can update a table once you store it in the database. Similar to creating a table, you need to import a few objects to construct the Table (not TableObject) object. You can change the name and/or schema of the table by specifying a name and schema in the Table, respectively. update() also returns the updated table object, similar to create().

For example, if we want to rename and update the table we just created by removing all constraints except for the primary key, we can do the following:

from thanosql.resources import Table

table_object_updated = Table(name="my_table_1_updated", columns=columns, constraints=Constraints(primary_key=primary_key))

# update and then check if the new values are passed in correctly
res = client.table.update(name="my_table_1", table=table_object_updated)
res

Deleting Tables

You can remove tables that you no longer use. You simply need to specify the name of the table if the table resides in the public schema. For tables in other schemas, you also have to use the schema parameter.

client.table.delete("my_table_1")
client.table.delete(name="my_table_1_in_my_schema", schema="my_schema")