upload file: allow user to specify the schema #31
Comments
Can you say more, with a concrete example?
Daniel Halperin On Wed, Dec 3, 2014 at 11:34 AM, Shumo Chu wrote:
One use case: we are trying to use Myria to process Google cluster usage data; each table has 10-20 columns. There are a lot of missing values in the CSV. During pre-processing, we replaced the missing values with -1 (Google just places nothing between two commas). The reason of using
Why not do the cleaning inside of Myria?
Daniel Halperin On Wed, Dec 3, 2014 at 11:52 AM, Shumo Chu wrote:
Does cleaning inside Myria mean ingesting the data as all string columns? Then this again requires manually specifying the schema. If we simply use the upload tool, I am not sure what a messy table will do, since many null values appear after the preview. Will update this issue once I get more data or bugs.
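To make the "ingest as strings, then apply a user-specified schema" idea concrete, here is a minimal sketch in plain Python. It does not use Myria or its upload tool; the column names, the `SCHEMA` mapping, and the `parse_with_schema` helper are all hypothetical and only illustrate how empty cells could become nulls instead of a sentinel like -1.

```python
import csv
import io

# Hypothetical user-specified schema: column name -> Python type.
SCHEMA = {"job_id": int, "cpu_usage": float, "user": str}


def parse_with_schema(text, schema):
    """Read every cell as a string, then cast it using the given schema.

    Empty cells (nothing between two commas) become None rather than
    a sentinel value such as -1.
    """
    names = list(schema)
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = {}
        for name, cell in zip(names, record):
            row[name] = None if cell == "" else schema[name](cell)
        rows.append(row)
    return rows


# Made-up sample data with missing values in several positions.
sample = "1,0.5,alice\n2,,bob\n,0.25,\n"
rows = parse_with_schema(sample, SCHEMA)
```

With an explicit schema, a column that happens to be empty in the first few rows still gets the intended type, which is the situation where inference from a preview is most likely to guess wrong.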
when the tool encounters an empty cell in the dataset, the column's schema
Daniel Halperin On Wed, Dec 3, 2014 at 12:14 PM, Shumo Chu wrote:
Automatically inferring the schema is very cool. But sometimes users may still want to specify the schema, because:
Just my personal opinion.