The following sections describe the quality conventions and best practices that apply to the development phase of a software component within the EOSC ecosystem. These guidelines ruled the software development process of the former European Commission-funded project INDIGO-DataCloud, where they have proved valuable for improving the reliability of software produced in the scientific European arena.
The next sections describe the development process driven by a change-based strategy, followed by a continuous integration approach. Changes in the source code, trigger automated builds to analyze the new contributions in order to validate them before being added to the software component code base. Consequently, software components are more eligible for deployment in production infrastructures, reducing the likelihood of service disruption.
-
[QC.Acc01] Following the open-source model, the source code being produced MUST be open and publicly available to promote the adoption and augment the visibility of the software developments.
-
[QC.Acc02] Source code MUST use a Version Control System (VCS).
- [QC.Acc02.1] It is RECOMMENDED that all software components delivered by the same project agree on a common VCS.
-
[QC.Acc03] Source code produced within the scope of a broader development project SHOULD reside in a common organization of a version control repository hosting service.
A change-based approach is accomplished with a branching model.
-
[QC.Wor01] The main branch in the source code repository MUST maintain a working state version of the software component.
-
[QC.Wor01.1] Main branch SHOULD be protected to disallow force pushing, thus preventing untested and un-reviewed source code from entering the production-ready version.
-
[QC.Wor01.2] New features SHOULD only be merged in the main branch whenever the SQA criteria is fulfilled.
-
-
[QC.Wor02] New changes in the source code MUST be placed in individual branches.
- [QC.Wor02.1] It is RECOMMENDED to agree on a branch nomenclature, usually by prefixing, to differentiate change types (e.g. feature, release, fix).
-
[QC.Wor03] The existence of a secondary long-term branch that contains the changes for the next release is RECOMMENDED.
-
[QC.Wor03.1] Next release changes SHOULD come from the individual branches.
-
[QC.Wor03.2] Once ready for release, changes in the secondary long-term branch are merged into the main branch and versioned. At that point in time, main and secondary branches are aligned. This step SHOULD mark a production release.
-
-
[QC.Man01] An issue tracking system facilitates structured software development. Leveraging issues to track down both new enhancements and defects (bugs, documentation typos) is RECOMMENDED.
-
[QC.Man01.1] In addition to monitoring the internal development, issues are the best means for supporting users. External users SHOULD be able to create issues based on the operational performance of the software.
-
[QC.Man01.2] The description of an issue SHOULD be concise and state clearly the problem. It is RECOMMENDED to add any reference to the actual problem. In the case of bugs, the issue SHOULD be accompanied by the relevant debug information. * The usage of templates for the issue description is RECOMMENDED.
-
-
[QC.Man02] In social coding environments, pull or merge requests represent the cornerstone of collaboration. A pull or merge request provides a place for review and discussion of the changes proposed to be part of an existing version of the code.
-
[QC.Man02.1] Pull/Merge requests SHOULD be used for every change in the codebase.
-
[QC.Man02.2] A software project SHOULD be open to external collaboration through pull/merge requests.
-
[QC.Man02.3] A pull/merge request description SHOULD be concise and state clearly its purpose (e.g. if it is fixing an observed bug or adding a new feature).
-
[QC.Man02.4] The usage of templates for the pull/merge request's description is RECOMMENDED.
-
[QC.Man02.5] It is RECOMMENDED to use pull/merge requests to address open issues.
-
[QC.Man02.6] The pull/merge request description SHOULD make reference to the identifiers of the issues it is fixing (to eventually close them, either manually or automatically).
-
Code review implies the informal, non-automated, peer review of any change in the source code [@https://owasp.org/www-project-code-review-guide/migrated_content]. It appears as the last step in the change management pipeline, once the candidate change has successfully passed over the required set of change-based tests.
-
[QC.Rev01] Code reviews MUST be done in the agreed peer review tool within the project, with the following RECOMMENDED functionality:
-
[QC.Rev01.1] Allows general and specific comments on the line or lines that need to be reviewed.
-
[QC.Rev01.2] Shows the results of the required change-based test executions.
-
[QC.Rev01.3] Allows to prevent merges of the candidate change whenever not all the required tests are successful. Exceptions to this rule cover the third-party or upstream contributions which MAY use the existing mechanisms or tools for code review provided by the target software project. This exception MUST only be allowed whenever the external revision lifecycle does not interfere with the project deadlines.
-
-
[QC.Rev02] Code reviews MUST be open and collaborative, allowing external experts revisions.
-
[QC.Rev03] Code reviews SHOULD be concise and use neutral language. The areas where the reviewers MAY focus are:
-
[QC.Rev03.1] Message description: commit message is clear, self-explanatory and describes precisely the objectives being addressed.
-
[QC.Rev03.2] Goal or scope: change is needed and/or addresses/fixes the whole set of objectives.
-
[QC.Rev03.3] Code analysis: useless statements in the code, library or modules imported but never used or code style suggestions.
-
[QC.Rev03.4] Review of required tests: check if they include tests of the changes, such as tests of new features, or tests of bug fixing (regression tests), ensuring proper validation of the changes.
-
[QC.Rev03.5] Review of documentation: whether the change SHOULD bring along a corresponding update in the documentation.
-
-
[QC.Rev04] Code reviews MUST be checked on change basis.
-
[QC.Rev05] Code reviews SHOULD assess the inherent security risk of the changes, ensuring that the security model has not been downgraded or compromised by the changes.
- [QC.Ver01] Semantic Versioning [@https://semver.org] specification is RECOMMENDED for tagging the production releases.
-
[QC.Lic01] As open-source software, source code MUST adhere to an open-source license to be freely used, modified and distributed by others. Non-licensed software is exclusive copyright by default.
- [QC.Lic01.1] Licenses MUST be physically present (e.g. as a LICENSE file) in the root of all the source code repositories related to the software component.
-
[QC.Lic02] License MUST be compliant with the Open Source Definition [@https://opensource.org/osd].
- [QC.Lic02.1] RECOMMENDED licenses are listed in the Open Source Initiative portal under the Popular Licenses category [@https://opensource.org/licenses], c.f. the complete list of Software Package Data Exchange (SPDX) [@https://spdx.org/licenses].
Metadata for the software component provides a way to achieve its full identification, thus making software citation viable [@doi:10.7717/peerj-cs.86]. It allows the assignment of a Digital Object Identifier (DOI) and is key towards preservation, discovery, reuse, and attribution of the software component.
- [QC.Met01] A metadata file SHOULD exist along side the code, under its VCS. The metadata file SHOULD be updated when needed, as is the case of a new version.
-
[QC.Doc01] Documentation MUST be treated as code.
- [QC.Doc01.1] Version controlled, it MAY reside in the same repository where the source code lies.
-
[QC.Doc02] Documentation MUST use plain text format using a markup language, such as Markdown or reStructuredText.
- [QC.Doc02.1] It is RECOMMENDED that all software components delivered by the same project agree on a common markup language.
- [QC.Doc2.2] Each individual documentation file SHOULD comply with community-driven or de-facto style standards for the markup languages being used.
-
[QC.Doc03] Documentation MUST be online and available in a documentation repository.
- [QC.Doc03.1] Documentation SHOULD be rendered automatically.
-
[QC.Doc04] Documentation MUST be updated on new software versions involving any substantial or minimal change in the behavior of the application.
-
[QC.Doc05] Documentation MUST be updated whenever reported as inaccurate or unclear.
-
[QC.Doc06] Documentation MUST be produced according to the target audience, varying according to the software component specification. The identified types of documentation and their RECOMMENDED content are:
-
[QC.Doc06.1] README file MUST be present:
- One-paragraph description of the application.
- A "Getting Started" step-by-step description on how to get a development environment running (prerequisites, installation).
- Automated test execution how-to.
- Links to the external documentation below (production deployment, user guides).
- Versioning specification.
- Author list and contacts.
- Acknowledgments.
-
[QC.Doc06.2] CONTRIBUTING file MUST be present in order to communicate how external parties can contribute to the code.
-
[QC.Doc06.3] A code of conduct (usually defined in a CODE_OF_CONDUCT file) MUST be present in order to establish the positive social attitudes expected within the community of code contributors.
-
[QC.Doc06.4] LICENSE file MUST be present, License information with detailed description.
-
[QC.Doc06.5] Developer
- Private API documentation.
- Structure and interfaces.
- Build documentation.
-
[QC.Doc06.6] Deployment and Administration
- Installation and configuration guides.
- Service Reference Card, with the following RECOMMENDED content:
- Brief functional description.
- List of processes or daemons.
- Init scripts and options.
- List of configuration files, location and example or template.
- Log files location and other useful audit information.
- List of ports.
- Service state information.
- List of cron jobs.
- Security information.
- FAQs and troubleshooting.
-
[QC.Doc06.7] User
- Public API documentation.
- Command Line Interface (CLI) reference.
-
Code style requirements pursue the correct maintenance of the source code by the common agreement of a series of style conventions. These vary based on the programming language being used.
-
[QC.Sty01] Each individual software product MUST comply with community-driven or de-facto code style standards for the programming languages being used.
- [QC.Sty01.1] Compliance with multiple complementary standards MAY exist.
-
[QC.Sty02] Custom code style guidelines SHOULD be avoided, only considered in the hypothetical event of programming languages without existing community style standards.
-
[QC.Sty02.1] Custom styles MUST be documented by defining each convention and its expected output.
-
[QC.Sty02.2] Custom styles SHOULD evolve over time towards a more consistent definition.
-
-
[QC.Sty03] Exceptions of individual conventions from the main definition are allowed but SHOULD be avoided
- [QC.Sty03.1] Absence of standard conventions SHOULD be justified and tracked.
-
[QC.Sty04] Code style compliance testing MUST be automated and MUST be triggered for each candidate change in the source code.
Unit testing evaluates all the possible flows in the internal design of the code, so that its behavior becomes apparent. It is a key type of testing for early detection of failures in the development cycle.
-
[QC.Uni01] Minimum acceptable code coverage threshold SHOULD be 70%.
-
[QC.Uni01.1] Unit testing coverage SHOULD be higher for those sections of the code identified as critical by the developers, such as units part of a security module.
-
[QC.Uni01.2] Unit testing coverage MAY be lower for external libraries or pieces of code not maintained within the product’s code base.
-
-
[QC.Uni02] Units SHOULD reside in the repository code but separated from the main code.
-
[QC.Uni03] Unit testing coverage MUST be checked on change basis.
-
[QC.Uni04] Unit testing coverage MUST be automated.
In software development, a test harness [@https://searchsoftwarequality.techtarget.com/definition/test-harness], is a collection of software and test data used by developers to unit test software models during development. A test harness will specifically refer to test doubles, which are programs that interact with the software being tested. Once a test harness is used to execute a test, they can also utilize a test library to generate reports.
It is also a simple form of Integration Testing, where interaction and integration with external components are substituted by a Double.
Test Double is a generic term for any case where you replace a production object for testing purposes. There are various kinds of double [@isbn:9780131495050]:
-
Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
-
Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (such as an InMemoryTestDatabase).
-
Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test.
-
Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages where sent.
-
Mocks are pre-programmed with expectations which form a specification of the calls they are expected to receive. They can throw an exception if they receive a call they don't expect and are checked during verification to ensure they got all the calls they were expecting.
As such the following criteria is defined for Test Harness:
-
[QC.Har01] When working on automated testing, the use of Test Doubles is RECOMMENDED to mimic a simplistic behavior of objects and procedures.
-
[QC.Har02] Test Doubles SHOULD reside in the software component repository code base but separated from the main code.
-
[QC.Har03] Regression testing, that checks the conformance with previous tests, SHOULD be covered at this stage by executing the complete set of Test Doubles available.
-
[QC.Har04] Test Doubles and regression, MUST be checked on change basis.
Test-Driven Development [@isbn:9780321146533], is a software development process relying on software requirements being converted to test cases before software is fully developed, and tracking all software development by repeatedly testing the software against all test cases. This is opposed to software being developed first and test cases created later.
- [QC.Tdd01] Software requirements SHOULD be converted to test cases, and these test cases SHOULD be checked automatically.
Security assessment is essential for any production Software. An effective implementation of the security requirements applies to every stage in the Software Development Life Cycle (SDLC), especially effective at the source code level.
-
[QC.Sec01] Secure coding practices MUST be applied into all the stages of a software component development lifecycle.
- [QC.Sec01.1] Compliance with Open Web Application Security Project (OWASP) secure coding guidelines [@https://owasp.org/www-project-secure-coding-practices-quick-reference-guide/migrated_content] is RECOMMENDED, even for non-web applications.
-
[QC.Sec02] Source code MUST use automated linter tools to perform static application security testing (SAST) [@https://owasp.org/www-community/Source_Code_Analysis_Tools] that flag common suspicious constructs that may cause a bug or lead to a security risk (e.g. inconsistent data structure sizes or unused resources).
-
[QC.Sec03] Security code reviews [@https://owasp.org/www-project-code-review-guide/migrated_content] for certain vulnerabilities SHOULD be done as part of the identification of potential security flaws in the code. Inputs SHOULD come from automated linters.
-
[QC.Sec04] World-writable files or directories MUST NOT be present in the product’s configuration or logging locations.
Automated delivery comprises the build of Software into an artifact, its upload/registration into a public repository of such artifacts and notification of the success of the process.
-
[QC.Del01] Production-ready code MUST be built as an artifact that can be efficiently executed on a system.
- [QC.Del01.1] The built artifact SHOULD be as minimal as possible, including no more than the precise runtime environment and dependencies required for the execution of the software.
-
[QC.Del02] The built artifact MUST be uploaded and registered into a public repository of such artifacts.
-
[QC.Del03] Upon success of the previous (QC.Del02) process, a notification MUST be sent to pre-defined parties such as the main developer or team.
-
[QC.Dep01] Production-ready code MUST be deployed as a workable system with the minimal user or system administrator interaction leveraging software configuration management (SCM) tools.
-
[QC.Dep02] A SCM module is treated as code.
- [QC.Dep02.1] Version controlled, it SHOULD reside in a different repository than the source code to facilitate the distribution.
-
[QC.Dep03] It is RECOMMENDED that all software components delivered by the same project agree on a common SCM tool.
- [QC.Dep03.1] However, software products MAY not be restricted to provide a unique solution for the automated deployment.
-
[QC.Dep04] Any change affecting the application’s deployment or operation MUST be subsequently reflected in the relevant SCM modules.
-
[QC.Dep05] Official repositories provided by the main developer SHOULD be used to host the SCM modules, thus augmenting the visibility and promote external collaboration.