The mission of the Terra Data Catalog is to make research data accessible and searchable to accelerate biomedical discoveries.
If you are a new member of the Broad, follow the getting started guide first.
Ensure you have Java 17 installed and that it is the default. To check this while in the terra-data-catalog directory, run java --version.
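The version check can also be scripted. The following is a minimal sketch, not part of the repository: the helper name java_major_version is hypothetical, and it assumes the first line of the java --version output has the usual form of a vendor name followed by the version number (for example, "openjdk 17.0.9 2023-10-17").

```shell
# Hypothetical helper: extract the major version from the first line of
# `java --version` output, e.g. "openjdk 17.0.9 2023-10-17" -> 17.
java_major_version() {
  printf '%s\n' "$1" | head -n 1 | awk '{ split($2, parts, "."); print parts[1] }'
}

if command -v java >/dev/null 2>&1; then
  major=$(java_major_version "$(java --version 2>&1)")
  if [ "$major" = "17" ]; then
    echo "Java 17 is the default"
  else
    echo "Default Java is version ${major:-unknown}, expected 17" >&2
  fi
else
  echo "java not found on PATH" >&2
fi
```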
Then, to build the code without executing tests, run:
./gradlew build -x test
If you don't include -x test, ensure the Postgres database is initialized as described below.
For tests, ensure you have a local Postgres instance running. While in the terra-data-catalog directory, initialize the database:
psql -f scripts/postgres-init.sql
After the database is initialized, run the integration tests:
./scripts/render_configs.sh # render service account credentials needed for tests
./gradlew bootRun & # start up a local instance of the data catalog service
sleep 5 # wait until service comes up
./gradlew runTest --args="suites/local/FullIntegration.json build/reports"
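A fixed sleep 5 can be too short on a slow machine, in which case the tests start before the service is ready. One possible alternative is to poll the service until it responds. The sketch below is an assumption rather than part of the repository: wait_for_url is a hypothetical helper, and both the port (8080, the Spring Boot default) and the /status path in the usage comment are guesses that should be checked against the service configuration.

```shell
# Hypothetical helper: poll a URL until it responds successfully
# or the timeout (in seconds, default 60) expires.
wait_for_url() {
  url=$1
  timeout=${2:-60}
  elapsed=0
  until curl --silent --fail --output /dev/null "$url"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Usage (the URL is an assumption; adjust host, port, and path as needed):
#   ./gradlew bootRun &
#   wait_for_url "http://localhost:8080/status" 60 &&
#     ./gradlew runTest --args="suites/local/FullIntegration.json build/reports"
```

Chaining with && means the test suite only starts once the service has answered, and the script fails with a clear message otherwise.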
The catalog service uses Liquibase to track and manage changes to the database schema. Liquibase runs each changeset (migration) listed in the changelog.xml file and maintains a record of what has been run, so new changes must be added in a new changeset.
To run migrations locally use:
./gradlew update
If the local database gets into a bad state (for instance while testing/modifying a new changeset), drop its contents with:
./gradlew dropAll
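After dropping the contents, the schema presumably needs to be recreated by running the migrations again. The two tasks above can be combined into a small reset helper. This is a sketch under that assumption: reset_local_db and the GRADLEW override variable are hypothetical names, not something the repository provides.

```shell
# Hypothetical helper: drop the local database contents, then re-run all
# Liquibase migrations. GRADLEW can be overridden to point at another wrapper.
reset_local_db() {
  gradlew=${GRADLEW:-./gradlew}
  "$gradlew" dropAll && "$gradlew" update
}

# Usage, from the terra-data-catalog directory:
#   reset_local_db
```

Because the two tasks are joined with &&, the migrations are only re-run if dropAll succeeds.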
SourceClear is a static analysis tool that scans a project's Java dependencies for known vulnerabilities. If you get a build failure due to a SourceClear error and want to debug the problem locally, you need to get the API token from Vault before running the Gradle task.
export SRCCLR_API_TOKEN=$(vault read -field=api_token secret/secops/ci/srcclr/gradle-agent)
./gradlew srcclr
Sonar is a static analysis tool that scans code for a wide range of issues, including maintainability problems and possible bugs. If you get a build failure due to Sonar and want to debug the problem locally, you need to get the Sonar token from Vault before running the Gradle task.
export SONAR_TOKEN=$(vault read -field=sonar_token secret/secops/ci/sonarcloud/catalog)
./gradlew sonar
Unlike SourceClear, running this task produces no output unless your project has errors. To always generate a report, run using --info:
./gradlew sonar --info
Datasets in the catalog (also known as catalog entries) have a JSON schema that they must conform to. The current schema for dataset entries is schema.json, and an example entry that conforms to this schema is example.json. This schema is based on the TerraDCAT_AP model from the Terra Interoperability Model.
The example entry was generated from the JSON schema using the online tool JSON Schema Faker.