Per #3877, the code to wait for spot instance requests to finish would
hang for the full wait time if any spot request failed for any reason.
This commit introduces status checks for spot requests, so if the
request fails, finishes, or is cancelled the task will fail/succeed
accordingly.
One edge case introduced here is tha if a user terminates the instance
associated with the request manually it won't fail the play, under the
presumption that the user *wants* the instance terminated.
Previously calculation of the number of instances that have been
terminated assumed all instances were in the first reservation returned
by AWS. If this is not the case the calculated number of instances
terminated never reaches the number of instances and the module always
times out. By unpacking the instances we get an accurate number and the
module correctly exits.
Currently instances with multiple ENI's can't be started or stopped
because sourceDestCheck is a per-interface attribute, but we use the
boto global access to it (which only works when there's a single ENI).
This patch handles multiple ENI's and applies the sourcedestcheck across
all interfaces the same way.
Fixes#3234
If you apply `wait=yes` and use `instance_tags` as your filter for
stopping/starting EC2 instances, this stack trace happens:
```
An exception occurred during task execution. The full traceback is: │~
Traceback (most recent call last): │~
File "/tmp/ryansb/ansible_FwE8VR/ansible_module_ec2.py", line 1540, in <module> │~
main() │~
File "/tmp/ryansb/ansible_FwE8VR/ansible_module_ec2.py", line 1514, in main │~
(changed, instance_dict_array, new_instance_ids) = startstop_instances(module, ec2, instance_ids, state, instance_tags) │~
File "/tmp/ryansb/ansible_FwE8VR/ansible_module_ec2.py", line 1343, in startstop_instances │~
if len(matched_instances) < len(instance_ids): │~
TypeError: object of type 'NoneType' has no len() │~
│~
fatal: [localhost -> localhost]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_name": "ec2"}, "module_stderr": "Traceb│~
ack (most recent call last):\n File \"/tmp/ryansb/ansible_FwE8VR/ansible_module_ec2.py\", line 1540, in <module>\n main()\n File \"/tmp/│~
ryansb/ansible_FwE8VR/ansible_module_ec2.py\", line 1514, in main\n (changed, instance_dict_array, new_instance_ids) = startstop_instances│~
(module, ec2, instance_ids, state, instance_tags)\n File \"/tmp/ryansb/ansible_FwE8VR/ansible_module_ec2.py\", line 1343, in startstop_insta│~
nces\n if len(matched_instances) < len(instance_ids):\nTypeError: object of type 'NoneType' has no len()\n", "module_stdout": "", "msg": "│~
MODULE FAILURE", "parsed": false}
```
That's because the `instance_ids` variable is None if not supplied
in the task. That means the instances that result from the instance_tags
query aren't going to be included in the wait loop. To fix this, a list
needs to be kept of instances with matching tags and that list needs to
be added to `instance_ids` before the wait loop.
Before this, all spot instance requests would fail because the code
_always_ called module.fail_json when the parameter was set (which it
always was, because the module parameter's default was set to 'stop').
As the comment said, this parameter doesn't make sense for spot
instances at all, so the error message was also misleading.
The `source_dest_check` and `termination_protection` variables are being
assigned twice in ec2.py, likely due to an incorrect merge somewhere
along the line.
'exact_count' and 'state' are mutually exclusive options they should not be in the following examples:
- # Enforce that 5 running instances named "database" with a "dbtype" of "postgres" example and
- # Enforce that 5 instances with a tag "foo" are running
When this was treated as a boolean, sphinx was leaving the Default
column on http://docs.ansible.com/ansible/ec2_module.html blank,
implying it would use AWS's default. In reality, it passes False, which
overrides the defaults at AWS (it's possible to boot an instance which
AWS claims will always have EBS optimization without it because of this
silently passed False).
Without this, «ec2: state=stopped instance_ids=…» would fail with a
traceback like this:
if inst.get_attribute('sourceDestCheck')['sourceDestCheck'] != source_dest_check:
NameError: global name 'source_dest_check' is not defined
Both `source_dest_check` and `termination_protection` variables are not
available within the scope of the startstopec2 instance method. This just
pulls them from module.params.