Releases: d2iq-archive/marathon
v1.5.5
Changes from 1.5.4 to 1.5.5
- MARATHON-7974 Fixes a security concern for a non super user. It was discovered that permissions for GroupResources were not honored; this is now resolved.
- MARATHON-7919 Fix byte stream blocking on proxy of non-leader buffers
changes: v1.5.4...v1.5.5
v1.3.14
Changes from 1.3.13 to 1.3.14
- MARATHON-7974 Fixes a security concern for a non super user. It was discovered that permissions for GroupResources were not honored; this is now resolved.
changes: v1.3.13...v1.3.14
v1.5.4
Changes from 1.5.3 to 1.5.4
Fixed issues
- MARATHON_EE-1773 Enable killGracePeriodSeconds for Pods
Diff: v1.5.3...v1.5.4
v1.5.3
Changes from 1.5.2 to 1.5.3
Bugfix release
Fixed issues
- MARATHON_EE-1701 Pods now respect hostname: UNIQUE constraint (#5793)
- MARATHON_EE-1768 Proper reading of secrets from volumes (#5791)
- MARATHON_EE-1764 Speed up service port assignment
- MARATHON_EE-1770 Speed up service port assignment (#5796)
Diff: v1.5.2...v1.5.3
v1.5.2
Changes from 1.5.1 to 1.5.2
Bugfix release
Fixed issues
- MARATHON-7790 Migrate UnreachableStrategy Saved in Instances
- Change Int.MaxValue to 8 for maximum concurrency during migration (#5676) (#5701)
- MARATHON-7848 Allow underscore for network names (#5687)
- MARATHON-7788 Store a flag node in ZK to indicate that Marathon performs data migration (#5662)
- MARATHON-7852 Switch to use Debian Slim base image (#5668)
- Extend the set of command-line flags returned by /v2/info (#5612)
- MARATHON-7784 Do not ignore exceptions when resolving apps and pods in GroupRepository (#5607)
- Update Mesos version to 1.4.0 which is used for building packages and Docker images (#5606)
- MARATHON-7763 Do sync before reading from or writing to ZooKeeper (#5566)
- Fixes bug in which some Condition values were improperly read (#5555) (#5557)
- Added an option to disable a plugin (#5524)
v1.4.9
Changes from 1.4.8 to 1.4.9
Fixed issues
- MARATHON-7790 Migrate UnreachableStrategy Saved in Instances
- MARATHON-7806 Fixes bug in which some Condition values were improperly read
- MARATHON-7685 - Simplify leadership code
- MARATHON-7763 - Proper zkSync Locks
- MARATHON-7785 - Run zkSync after leader election, and before any state is read
- MARATHON-7788 - Add ZK Flag for data migrations
1.4.9 New Behavior
Migrating unreachableStrategy - running instances
If you have already migrated your apps and pods to the new default behavior for UnreachableStrategy, consider migrating the running instances as well.
To change the unreachableStrategy of all running instances, set the environment variable MIGRATION_1_4_6_UNREACHABLE_STRATEGY to true, which leads to the following behavior during migration:
When opting in to the unreachable migration step:
- all instances that had a config of UnreachableStrategy(300 seconds, 600 seconds) (previous default) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
- all instances that had a config of UnreachableStrategy(1 second, x seconds) are migrated to have UnreachableStrategy(0 seconds, x seconds)
- all instances that had a config of UnreachableStrategy(1 second, 2 seconds) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
Note: If you set this variable after upgrading to 1.4.9, it will have no effect.
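The migration rules above can be read as a small mapping on (inactiveAfterSeconds, expungeAfterSeconds) pairs. The following Python sketch is illustrative only and is not Marathon's migration code. The function name migrate_unreachable_strategy is hypothetical, and the sketch assumes that the (1, 2) rule takes precedence over the more general (1, x) rule and that any other configuration is left unchanged.

```python
from typing import Tuple

Strategy = Tuple[int, int]  # (inactiveAfterSeconds, expungeAfterSeconds)


def migrate_unreachable_strategy(strategy: Strategy) -> Strategy:
    """Apply the opt-in migration rules listed above (hypothetical helper)."""
    inactive, expunge = strategy
    if strategy == (300, 600):   # previous default
        return (0, 0)
    if strategy == (1, 2):       # assumed to take precedence over (1, x)
        return (0, 0)
    if inactive == 1:            # (1, x) keeps its expunge timeout
        return (0, expunge)
    return strategy              # assumption: everything else is untouched
```

Checking (1, 2) before (1, x) matters in this sketch, because otherwise (1, 2) would be migrated to (0, 2) rather than (0, 0).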
v1.5.1.1
Changes from 1.5.0 to 1.5.1.1
Hotfix release
Fixed issues
- MARATHON-7848 Fixes regression in which underscores were no longer permitted in network names
v1.5.1
Changes from 1.5.0 to 1.5.1
Bugfix release
Fixed issues
- MARATHON-7576 Changed default of UnreachableEnabled to (0,0)
- D907 TaskLauncherActor doesn't wait for in-flight tasks on stop
- MARATHON-7765 Fixes issue in which /v2/info endpoint always returned 1.5.0-snapshot1, regardless of the actual endpoint.
- PR 5421 Added SchedulerPlugin to enable the ability to customize the rejection of offers. (see below)
- MARATHON-2520 Improved logging around migration
- D1044 EventStream implementation moved to Akka eventStream
- MARATHON-7545 Initialize RunSpecTaskProcessor and RunSpecValidator at startup to early detect misconfiguration.
- D974 Plugin configuration or initialization issues are made more obvious, potentially causing Marathon to not launch.
- MARATHON-7707 Resident tasks now have an up-to-date agentInfo (agentId) when they are re-launched, rather than preserving the agentInfo as received during initial launch.
- MARATHON-7724 Better socket error handling in the leader proxy.
- MARATHON-7711, MARATHON-7338 Under certain circumstances, resident tasks wouldn't relaunch when resources were available, and reservations wouldn't be freed. In order to address this, Marathon no longer suppresses offers from Mesos.
- PR 5432 App and pod validation errors for missing network name.
- MARATHON-1703 Fixed issue in which constraints would not be properly evaluated when launching multiple resident tasks at a time.
1.5.1 New Behavior
mesosphere.marathon.plugin.scheduler.SchedulerPlugin
This plugin allows offers to be rejected. Possible use cases are:
- Maintenance. Mark an agent as going into maintenance and reject new offers from it.
- Analytics. If a task fails, for example, 5 times within 5 minutes, we can assume that it will fail again and reject new offers for it.
- Binding to agents. For example, agents can be marked as belonging to a primary or secondary group, and a task can be marked with a group name. The plugin can then schedule task deployment to primary agents. If all primary agents are busy, the task can be scheduled to secondary agents.
v1.4.8
Changes from 1.4.7 to 1.4.8
Fixed issues
- MARATHON-7697 - CLUSTER constraint doesn't require a value.
- MARATHON-7724 - Better error handling around proxy leader failure.
- MARATHON-7728 - Always log exceptions when performing migration.
- MARATHON-7696 - Do a better job at maintaining task failure rate limiting values per RunSpec.
v1.5.0
Changes from 1.4.x to 1.5.0
Recommended Mesos version is 1.3.0
Breaking Changes
Packaging standardized
We now publish more normalized packages that attempt to follow Linux Standard Base Guidelines and use sbt-native-packager to achieve this.
As a result of this and the many historic ways of passing options into Marathon, we will only read /etc/default/marathon when starting up.
This file, like /etc/sysconfig/marathon, has all Marathon command line options as "MARATHON_XXX=YYY" which will translate to --xxx=yyy.
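The naming convention described above can be sketched in a few lines of Python. This is not the packaged startup script; the helper name env_to_flags is hypothetical, values are assumed to be passed through unchanged, and flags that take no value are not covered.

```python
from typing import Dict, List

PREFIX = "MARATHON_"


def env_to_flags(env: Dict[str, str]) -> List[str]:
    """Turn MARATHON_XXX=YYY entries into --xxx=YYY style arguments."""
    flags = []
    for name in sorted(env):
        if not name.startswith(PREFIX):
            continue  # unrelated variables are ignored
        option = name[len(PREFIX):].lower()
        flags.append(f"--{option}={env[name]}")  # value kept as-is (assumption)
    return flags
```

For example, under these assumptions an entry MARATHON_MESOS_BRIDGE_NAME=br0 would become --mesos_bridge_name=br0.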
We no longer support /etc/marathon/conf which was a set of files that would get translated into command line arguments. In addition,
we no longer assume that if there is no zk/master argument passed in, then both are running on localhost.
If support for any of the above is important to you, please file a JIRA and/or create a PR/Patch.
App JSON Fields Changed or Moved.
Marathon will continue to accept the app JSON as it did in 1.4;
however, applications that use deprecated fields will be normalized into a canonical representation.
The app JSON generated by the /v2 REST API has changed: only canonical fields are generated.
The App RAML specification is the source of truth with respect to deprecated fields.
The following deprecated fields will no longer be generated for app JSON:
ipAddress
container.docker.portMappings
container.docker.network
ports
uris
Marathon clients that consume these deprecated fields will require changes.
In addition, new networking API fields have been introduced:
networks
container.portMappings
The networks field replaces the ipAddress.networkName and container.docker.network fields, and supports joining an app to multiple container networks.
The legacy IP/CT API did not require a resolvable network name in order to use a container network; it allowed both an app definition to leave ipAddress.networkName unspecified and the operator to leave --default_network_name unspecified.
Starting with Marathon v1.5 such apps will be rejected: apps may leave networks[x].name unspecified for container networks only if --default_network_name has been specified by the operator.
Marathon injects the value of --default_network_name into unnamed container networks upon app create/update.
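The naming rule above can be illustrated with a short Python sketch. It is not Marathon's validation code: the function name resolve_network_names is hypothetical, and it assumes that container networks are identified by a mode value of "container" in each networks entry.

```python
import copy
from typing import Any, Dict, Optional


def resolve_network_names(app: Dict[str, Any],
                          default_network_name: Optional[str]) -> Dict[str, Any]:
    """Fill in unnamed container networks, or reject the app (sketch)."""
    resolved = copy.deepcopy(app)  # leave the caller's definition untouched
    for network in resolved.get("networks", []):
        # Assumption: container networks are marked with mode "container".
        if network.get("mode") != "container" or network.get("name"):
            continue
        if not default_network_name:
            raise ValueError("container network has no name and "
                             "--default_network_name is not set")
        network["name"] = default_network_name
    return resolved
```

With a default of "dcos", an entry {"mode": "container"} would come back as {"mode": "container", "name": "dcos"}; without a default, the same app would be rejected.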
Upgrading from Marathon 1.4.x to Marathon 1.5.x will automatically migrate existing applications to the new networking API.
Migration of legacy Mesos IP/CT apps may fail if those apps did not specify ipAddress.networkName and there is no default network name specified.
See the [networking documentation](docs/docs/networking.md) for details concerning app migration and network API changes.
The old app networking docs have been relocated.
See the networking documentation for details concerning the new API.
Metric Names Changed or Moved.
We moved to a different Metrics library and the metrics are not always compatible or the same as existing metrics;
however, the metrics are also now more accurate, use less memory, and are expected to get better throughout the release.
Where it was possible, we maintained the original metric names/groupings/etc, but some are in new locations or have
slightly different semantics. Any monitoring dashboards should be updated.
Before 1.5.0 releases, we will publish a migration guide for the new metric formats and where the replacement
metrics can be found and the formats they are now in.
Artifact store has been removed
The artifact store was deprecated with Marathon 1.4 and is removed in this version.
The command line flag --artifact_store will throw an error if specified.
The REST API endpoint /v2/artifacts has been removed completely.
Logging endpoint
Log level configuration can be viewed and changed at runtime via the /logging endpoint.
This version switches from a form based API to a JSON based API, while maintaining the functionality.
We also secured this endpoint, so you can restrict who is allowed to view or update this configuration.
See the API documentation for details.
Event Subscribers has been removed.
The event subscribers endpoint (/v2/eventSubscribers) was deprecated in Marathon 1.4 and is removed in this version.
Please move to the /v2/events endpoint instead.
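For clients moving off event subscribers, the following Python sketch shows one way to decode lines read from /v2/events. It assumes the stream follows the Server-Sent Events format (event: and data: lines, with a blank line ending each event) and that each data payload is JSON. The function name parse_event_stream is hypothetical, and connecting to the endpoint is left out.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def parse_event_stream(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_type, payload) pairs from Server-Sent Events lines."""
    event_type = ""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:  # a blank line ends one event
            yield event_type, json.loads("\n".join(data))
            event_type, data = "", []
```

The generator accepts any iterable of text lines, so it can be fed from a streaming HTTP response or from a recorded sample when testing.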
Removed command line parameters
- The command line flag max_tasks_per_offer has been deprecated since 1.4 and is removed now. Please use max_instances_per_offer.
Deprecated command line parameters
- The command line flag save_tasks_to_launch_timeout is deprecated and no longer has any effect.
Overview
Networking Improvements Involving Multiple Container Networks
The field networkNames has been added to app container's ContainerPortMapping and pod's Endpoint. Using this field, an app or pod participating in multiple container networks can now forward ports by specifying a single-item networkNames. For more information, see the networking documentation.
Additionally, container port discovery has been improved: a pod or app can now specify which container network(s) a port name/protocol/etc. is associated with. Discovery labels are now generated for container networks associated with ports.
Mesos Bridge Network Name Configurable
The CNI network used for Mesos containers when bridge networking is in use is now configurable via the command-line argument --mesos_bridge_name. As with other command-line arguments, this can also be specified via the environment variable MARATHON_MESOS_BRIDGE_NAME.
Backup and Restore Operations
You can now back up and restore Marathon's internal state via the DELETE /v2/leader API endpoint.
See MARATHON-7041
TTY support
You can now specify that a TTY should be allocated for app or pod containers. See the TTY definition. An example can be found in v2/examples/app.json.
See MARATHON-7062
Improved Validation Error Messages
All validation specified in the RAML is now programmatically enforced, leading to more consistent, descriptive, and legible error messages.
Security improvements
Marathon is in better compliance with various security best-practices. An example of this is that Marathon no longer responds to the directory listing request.
File-based secrets
Marathon has a pluggable interface for secret store providers.
Previous versions of Marathon allowed secrets to be passed as environment variables.
With this version it is also possible to provide secrets as volumes, mounted under a specified path.
See file based secret documentation
Changes around unreachableStrategy
Recent changes in Apache Mesos introduced the ability to handle intermittent connectivity to an agent which may be running a Marathon task. This change introduced the TASK_UNREACHABLE task state, which allows a node to disconnect and reconnect to the cluster without having a task replaced. With default configurations, this resulted in a delay of 75 seconds before Marathon would be notified by Mesos to replace the task. The previous behavior of Marathon was usually sub-second replacement of a lost task.
It is now possible to configure unreachableStrategy for apps and pods to instantly replace unreachable apps or pods. To enable this behavior, you need to configure your app or pod as shown below:
{
  ...
  "unreachableStrategy": {
    "inactiveAfterSeconds": 0,
    "expungeAfterSeconds": 0
  },
  ...
}
Note: Instantly means as soon as Marathon becomes aware of the unreachable task. By default, Marathon is notified after 75 seconds by Mesos that an agent is disconnected. You can change this duration in Mesos by configuring agent_ping_timeout and max_agent_ping_timeouts.
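The 75-second figure is consistent with the product of the two Mesos settings named above. The Python sketch below makes that arithmetic explicit. It assumes Mesos defaults of 15 seconds for agent_ping_timeout and 5 for max_agent_ping_timeouts, values which these notes do not state, and the function name is hypothetical.

```python
# Assumed Mesos master defaults; check the values your cluster actually uses.
AGENT_PING_TIMEOUT_SECONDS = 15
MAX_AGENT_PING_TIMEOUTS = 5


def unreachable_notification_delay(ping_timeout_seconds: int,
                                   max_ping_timeouts: int) -> int:
    """Seconds of missed pings before an agent is reported as disconnected."""
    return ping_timeout_seconds * max_ping_timeouts


default_delay = unreachable_notification_delay(AGENT_PING_TIMEOUT_SECONDS,
                                               MAX_AGENT_PING_TIMEOUTS)
```

Under these assumptions default_delay is 75, and lowering either setting shortens the time before Marathon can react to a disconnected agent.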
Migrating unreachableStrategy
You can have all of your apps and pods adopt an UnreachableStrategy that retains the previous behavior, in which instances were immediately replaced, without having to update every single app definition.
To change the unreachableStrategy of all apps and pods, set the environment variable MIGRATION_1_4_6_UNREACHABLE_STRATEGY to true, which leads to the following behavior during migration:
When opting in to the unreachable migration step:
- all app and pod definitions that had a config of UnreachableStrategy(300 seconds, 600 seconds) (previous default) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
- all app and pod definitions that had a config of UnreachableStrategy(1 second, x seconds) are migrated to have UnreachableStrategy(0 seconds, x seconds)
- all app and pod definitions that had a config of UnreachableStrategy(1 second, 2 seconds) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
Note: If you set this variable after upgrading to 1.4.6, it will have no effect. Also, the UnreachableStrategy default has not been changed, so in order for apps and pods created in the future to have the replace-instantly behavior, unreachableStrategy's inactiveAfterSeconds and expungeAfterSeconds must be set to 0 as seen in the JSON above.
Fixed issues
- [MARATHON-7320](https://jira.mesosphere.com/browse/MARATHON-7...