Commit Graph

76 Commits

Author SHA1 Message Date
tungol
55c589072f
Admin: sorting imports with ruff (#7075) 2023-11-30 14:55:51 -01:00
Chih-Hsuan Yen
3eed5c2d68
Batch: mark job test as requiring docker (#7048) 2023-11-20 18:28:05 -01:00
Bert Blommers
a23ac8bdff
Batch: submit_job() now returns the jobArn-attribute (#6944) 2023-10-23 19:43:29 +00:00
rafcio19
e5944307fc
Batch: arraySize and child jobs (#6541) 2023-10-12 14:06:57 +00:00
Akira Noda
121ad974b8
Batch: Using enum for job status (#6789) 2023-09-08 15:23:57 +00:00
Bert Blommers
bc29ae2fc3
Techdebt: Update TF tests (#6661) 2023-08-21 20:33:16 +00:00
Bert Blommers
8e35eedc3d
Batch: create_compute_environment() now validates instanceRole and minvCpu (#6470) 2023-07-01 10:33:21 +00:00
Bert Blommers
3741058242
Techdebt: Replace sure with regular asserts in Batch (#6413) 2023-06-16 10:42:07 +00:00
Hans Donner
18ec0c5467
Techdebt: skip tests when docker is not running (#6026) 2023-03-12 15:54:50 -01:00
Bert Blommers
ef1fab008a
ECS: Various improvements (#5880) 2023-01-29 22:47:50 -01:00
Bert Blommers
2f8a356b3f
Batch: add SchedulingPolicy methods (#5877) 2023-01-26 14:06:50 -01:00
Bert Blommers
a11cc558db
Batch: Return RequestId for all operations (#5870) 2023-01-24 14:50:10 -01:00
Tristan Rice
a17956927f
Batch: add multinode support (#5840) 2023-01-14 15:02:32 -01:00
Roc Granada Verdú
6da12892a3
Batch simple jobs inform log stream name (#5825) 2023-01-11 19:45:09 -01:00
Bert Blommers
27a2e42d9b
Admin: Update Docs to point to getmoto (#5826) 2023-01-07 10:35:14 -01:00
bmaisonn
4844af09cc
Batch: add jobname validation (#5720) 2022-11-29 08:44:25 -01:00
Bert Blommers
1a8ddc0f2b
Techdebt: Replace string-format with f-strings (for tests dirs) (#5678) 2022-11-17 21:41:08 -01:00
Bert Blommers
6f3b250fc7
TechDebt: MyPy Batch (#5592) 2022-10-23 13:26:55 +00:00
rav-evax
cde5537b85
Batch: align cancel_job and terminate_job (#5394) 2022-08-23 21:20:55 +00:00
Bert Blommers
12421068bd
Feature: Resource State Transition (#4734) 2022-05-01 11:45:59 +00:00
Bert Blommers
29d01c35bc
Update Black + formatting (#4926) 2022-03-10 13:39:59 -01:00
Bert Blommers
f0bb052343 Batch - Add Attempts to JobDescription 2022-02-20 22:35:58 -01:00
Bert Blommers
24afea36c0 Batch - TaskDefinition improvements 2022-02-20 19:53:25 -01:00
Bert Blommers
ca7bc9273a Batch - JobQueue improvements + Tag support 2022-02-20 13:01:29 -01:00
Bert Blommers
e1ffd27201
Increase Batch timeouts, in case Docker takes a while to start (#4827) 2022-02-04 20:10:46 -01:00
Adam Faulconbridge
3dfda9c1c9
validate containerProperties as strings (#4809) 2022-01-29 11:07:10 -01:00
Todd Morse
bbe4402b33
Add host to batch (#4801) 2022-01-27 22:25:18 -01:00
Bert Blommers
e020b06016
Batch:list_jobs() - extend list of return fields (#4727) 2021-12-28 13:02:18 -01:00
Bert Blommers
41de9b82ac
Batch - implement attemptDurationSeconds (#4636)
* Batch - implement attemptDurationSeconds

* Batch tests - make job def names unique
2021-11-26 22:25:53 -08:00
Brian Pandola
8b0a6f3d27
Allow batch job definition tags to be updated (#4620)
Fixes #4618
2021-11-22 16:47:35 -08:00
Adam Richie-Halford
f4e62f0dfd
ENH: Add resource_requirements to batch job definition (#4506) 2021-11-01 09:31:22 -01:00
Vincent Barbaresi
b7560c9ad2
Use a different method to compute a timestamp in milliseconds for describe output() (#4476) 2021-10-28 09:28:45 +00:00
Vincent Barbaresi
7e3db1ecac
Fix #4228: support Fargate batch compute environment (#4477) 2021-10-26 12:27:24 +00:00
Bert Blommers
14a69c7524
Techdebt: Enable pylint rules (#4432) 2021-10-18 19:44:29 +00:00
Bert Blommers
24ed6c8d34
Add support for AWS China endpoints (#3661) 2021-10-18 16:13:08 +00:00
Bert Blommers
88c6a2f6db
Update test_batch_jobs.py 2021-10-09 21:31:10 +00:00
Bert Blommers
8526013e61
Parallelize tests - Part 1 (#4368) 2021-10-05 17:11:07 +00:00
oakbramble
30c8c3de1f
Deregister batch job definition by 'name:revision' (#4355) 2021-09-27 17:19:44 +00:00
oakbramble
82158096d6
Add tagging to batch job definitions (#4316) 2021-09-21 16:12:18 +00:00
Bert Blommers
d278fd6eaa
Batch - remove duplicate tests (#4208) 2021-08-22 15:14:46 +01:00
Bert Blommers
914d07027f
Feature: Batch: cancel_job (#3769) 2021-08-22 12:29:23 +01:00
Bert Blommers
ee6f20e376
Batch - Test rework (#4134) 2021-08-04 13:40:10 +01:00
Thomas Maschler
d635c78bd1
AWS Batch enhancements (#3956)
* Check exit status of container

* Added support for job dependencies

* batch container overrides

* add AWS_BATCH_JOB_ID to container env variables

* lint with black

* refactor batch dependency test

* refactor batch dependency test

* fix index

Co-authored-by: jterry64 <justin.terry@wri.org>
Co-authored-by: Daniel Mannarino <daniel.mannarino@gmail.com>
2021-05-26 08:52:09 +01:00
Brian Pandola
f7467164e4
Fix Race Condition in batch:SubmitJob (#3480)
* Extract Duplicate Code into Helper Method

DRY up the tests and replace the arbitrary `sleep()` calls with a more
explicit check before progressing.

* Improve Testing of batch:TerminateJob

The test now confirms that the job was terminated by sandwiching a `sleep`
command between two `echo` commands.  In addition to the original checks
of the terminated job status/reason, the test now asserts that only the
first echo command succeeded, confirming that the job was indeed terminated
while in progress.

* Fix Race Condition in batch:SubmitJob

The `test_submit_job` in `test_batch.py` kicks off a job, calls `describe_jobs`
in a loop until the job status returned is SUCCEEDED, and then asserts against
the logged events.

The backend code that runs the submitted job does so in a separate thread. If
the job was successful, the job status was being set to SUCCEEDED *before* the
event logs had been written to the logging backend.

As a result, it was possible for the primary thread running the test to detect
that the job was successful immediately after the secondary thread had updated
the job status but before the secondary thread had written the logs to the
logging backend.  Under the right conditions, this could cause the subsequent
logging assertions in the primary thread to fail.

Additionally, the code that collected the logs from the container was using
a "dodgy hack" of time.sleep() and a modulo-based conditional that was
ultimately non-deterministic and could result in log messages being dropped
or duplicated in certain scenarios.

In order to address these issues, this commit does the following:

* Carefully re-orders any code that sets a job status or timestamp
  to avoid any obvious race conditions.
* Removes the "dodgy hack" in favor of a much more straightforward
  (and less error-prone) method of collecting logs from the container.
* Removes arbitrary and unnecessary calls to time.sleep()

Before applying any changes, the flaky test was failing about 12% of the
time.  Putting a sleep() call between setting the `job_status` to SUCCEEDED
and collecting the logs, resulted in a 100% failure rate.  Simply moving
the code that sets the job status to SUCCEEDED to the end of the code block,
dropped the failure rate to ~2%.  Finally, removing the log collection
hack allowed the test suite to run ~1000 times without a single failure.

Taken in aggregate, these changes make the batch backend more deterministic
and should put the nail in the coffin of this flaky test.
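
The ordering fix described above can be illustrated with a small sketch. This is not moto's actual backend code: `FakeJob`, `log_sink`, and `wait_for_status` are hypothetical names used only for illustration. The worker thread writes its logs before it publishes the terminal status, and the caller polls for that status instead of using a fixed `sleep()`.

```python
import threading
import time


class FakeJob(threading.Thread):
    """Hypothetical stand-in for a batch job that runs in its own thread."""

    def __init__(self, log_sink):
        super().__init__()
        self.status = "SUBMITTED"
        self.log_sink = log_sink  # hypothetical stand-in for a logging backend

    def run(self):
        self.status = "RUNNING"
        lines = ["hello", "world"]  # stand-in for container output
        # Write the logs first ...
        self.log_sink.extend(lines)
        # ... and only then publish the terminal status, so that any
        # observer that sees SUCCEEDED can rely on the logs being present.
        self.status = "SUCCEEDED"


def wait_for_status(job, wanted, timeout=5.0, interval=0.01):
    """Poll until the job reaches `wanted`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if job.status == wanted:
            return True
        time.sleep(interval)
    return False
```

With this ordering, a caller that observes SUCCEEDED via `wait_for_status(job, "SUCCEEDED")` can assert on the contents of `log_sink` without racing the worker thread. Swapping the two steps in `run()` would reintroduce the window the commit message describes.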

Closes #3475
2020-11-18 10:49:25 +00:00
Bert Blommers
273ca63d59 Linting 2020-11-11 15:55:37 +00:00
Bert Blommers
cb6731f340 Convert fixtures/exceptions to Pytest 2020-11-11 15:54:01 +00:00
Matěj Cepl
ea489bce6c Finish porting from nose to pytest. 2020-11-10 08:25:05 +01:00
Matěj Cepl
77dc60ea97 Port test suite from nose to pytest.
This just eliminates all errors on the tests collection. Elimination of
failures is left to the next commit.
2020-11-10 08:23:44 +01:00
Bert Blommers
db1d7123f6 List dependencies for services - add integration test to verify 2020-09-13 16:08:23 +01:00
Bert Blommers
bed769a387
Tech debt - increase test timeouts to remove intermittent test failures (#3146) 2020-07-17 12:11:47 +01:00