Loading Parquet Data from an S3 Bucket into StarRocks without Data Ingestion Using the External Table Feature #22723
Just a note: if you want to load your Parquet file into StarRocks for maximum performance, you can create a StarRocks OLAP table and perform an INSERT INTO ... SELECT from the external table.
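A minimal sketch of that approach, assuming the tutorial's external table is named cities_external in a database named cities_db with continent and city columns (all names and types are placeholders, not the original poster's exact schema):

```sql
-- Native StarRocks OLAP table mirroring the (assumed) external schema.
CREATE TABLE cities_db.cities_olap (
    continent VARCHAR(64),
    city      VARCHAR(64)
)
DUPLICATE KEY (continent)
DISTRIBUTED BY HASH (continent) BUCKETS 1;

-- Ingest the Parquet data by selecting it out of the external table.
INSERT INTO cities_db.cities_olap
SELECT continent, city
FROM cities_db.cities_external;
```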
This tutorial describes how you can load Parquet data without data ingestion using the "external table" feature. You can also use this tutorial to access Parquet data stored in a remote or local (#22782) S3-like object store.
Prerequisites
For this tutorial you need a StarRocks or CelerData database cluster. Setting one up is out of scope for this tutorial.
Downloading the Sample Data File
To download the sample Parquet data file, click cities.parquet.
The Parquet data file includes sample continent data.
Access an Object Store and Upload the Parquet Data File
This is out of scope for the tutorial. Note the URI of the uploaded file and any credentials needed to access it; you will need both later.
Create a Database and a Table, and Query the Data
The following commands create objects specifically for use with this tutorial. When you have completed the tutorial, you can drop these objects.
Step 0: Login to Database
To log in to the database, you'll need the server name, host port, username, and password.
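StarRocks speaks the MySQL protocol, so any MySQL-compatible client can be used. A minimal sketch, assuming the frontend node is reachable at starrocks-fe.example.com on the default query port 9030 with a root user (host and user are placeholders):

```
mysql -h starrocks-fe.example.com -P 9030 -u root -p
```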
Step 1: Create Database
Run the create database command.
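A minimal sketch, using cities_db as a placeholder database name:

```sql
CREATE DATABASE cities_db;
USE cities_db;
```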
Step 2: Create Table
Run the create table command.
Issue: The Snowflake "variant" type is not supported at this time, which is why the "varchar" type is used for "city". See GitHub Issue #22781.
AWS and MinIO
The command differs between AWS S3 and MinIO only in its connection properties; a sketch covering both follows.
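A minimal sketch, assuming StarRocks' file external table feature (ENGINE=FILE); the column names, bucket path, and credentials are all placeholders, not the original poster's values. For MinIO or another S3-compatible store, point aws.s3.endpoint at your own server and, typically, enable path-style access:

```sql
-- External table over the Parquet file in S3; no data is ingested.
-- "city" is VARCHAR because the "variant" type is not supported
-- (GitHub Issue #22781). All values below are placeholders.
CREATE EXTERNAL TABLE cities_external (
    continent VARCHAR(64),
    city      VARCHAR(64)
)
ENGINE = FILE
PROPERTIES (
    "path"   = "s3://my-bucket/cities.parquet",
    "format" = "parquet",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    -- AWS S3 endpoint; for MinIO use e.g. "http://minio.example.com:9000"
    -- together with "aws.s3.enable_path_style_access" = "true".
    "aws.s3.endpoint" = "https://s3.us-east-1.amazonaws.com"
);
```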
Step 3: Run Query
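A minimal sketch, reusing the placeholder names from the earlier steps:

```sql
SELECT continent, city
FROM cities_db.cities_external
LIMIT 10;
```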
The query should return the rows of the sample Parquet file.
Step 4: Clean Up
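A sketch of the clean-up, dropping only the placeholder objects created in the earlier sketches:

```sql
DROP TABLE IF EXISTS cities_db.cities_olap;
DROP TABLE IF EXISTS cities_db.cities_external;
DROP DATABASE IF EXISTS cities_db;
```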